THE SHARP WEIGHTED BOUND FOR GENERAL
CALDERON-ZYGMUND OPERATORS

TUOMAS P. HYTONEN 



Abstract. For a general Calderon-Zygmund operator $T$ on $\mathbb{R}^N$, it is shown
that

$$\|Tf\|_{L^2(w)} \le C(T) \cdot \sup_Q \Big( \fint_Q w \cdot \fint_Q w^{-1} \Big) \cdot \|f\|_{L^2(w)}$$

for all Muckenhoupt weights $w \in A_2$. This optimal estimate was known as the
$A_2$ conjecture. A recent result of Perez-Treil-Volberg reduced the problem to
a testing condition on indicator functions, which is verified in this paper.
The proof consists of the following elements: (i) a variant of the
Nazarov-Treil-Volberg method of random dyadic systems with just one random system
and completely without "bad" parts; (ii) a resulting representation of a general
Calderon-Zygmund operator as an average of "dyadic shifts"; and (iii) improvements
of the Lacey-Petermichl-Reguera estimates for these dyadic shifts,
which allow summing up the series in the obtained representation.

1. Introduction

Let $T \in \mathcal{L}(L^2(\mathbb{R}^N))$ be a fixed Calderon-Zygmund operator, i.e., one with the
integral representation

$$Tf(x) = \int_{\mathbb{R}^N} K(x,y) f(y)\,dy, \qquad x \notin \operatorname{supp} f,$$

for a kernel $K(x,y)$, defined for all $x \neq y$ on $\mathbb{R}^N \times \mathbb{R}^N$, and verifying the standard
estimates $|K(x,y)| \le \frac{C}{|x-y|^N}$ and

$$|K(x+h,y) - K(x,y)| + |K(x,y+h) - K(x,y)| \le \frac{C|h|^\alpha}{|x-y|^{N+\alpha}}$$

for all $|x-y| > 2|h| > 0$ and some fixed $\alpha \in (0,1]$. Let $w \in L^1_{\mathrm{loc}}(\mathbb{R}^N)$ be positive
almost everywhere. It is classical that the Muckenhoupt condition

$$\|w\|_{A_2} := \sup_Q \fint_Q w\,dx \cdot \fint_Q w^{-1}\,dx < \infty,$$

where the supremum is taken over all cubes $Q \subset \mathbb{R}^N$, is both sufficient for the
boundedness of all such $T$ on $L^2(w)$, and necessary for the $L^2(w)$-boundedness of
some particular operators $T$, like the Hilbert transform for $N = 1$.
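As a small numerical illustration of the Muckenhoupt condition (not part of the argument), the following sketch evaluates the product of averages for the power weight $w(x) = x^a$ on intervals of the half-line in dimension $N = 1$. The function names and the restriction to dyadic subintervals of $[0,1)$ are assumptions made for this example only; the averages are computed from the exact antiderivative, so no quadrature is involved.

```python
def mean_power(s, t, a):
    """Average of x**a over [s, t], for 0 <= s < t and a > -1 (exact antiderivative)."""
    return (t ** (a + 1) - s ** (a + 1)) / ((a + 1) * (t - s))


def a2_product(s, t, a):
    """(average of w) * (average of 1/w) over [s, t] for w(x) = x**a, |a| < 1."""
    return mean_power(s, t, a) * mean_power(s, t, -a)


def sup_over_dyadic(a, depth):
    """Largest A2 product over dyadic subintervals of [0, 1) of length >= 2**-depth."""
    best = 0.0
    for k in range(depth + 1):
        h = 2.0 ** -k
        for m in range(2 ** k):
            best = max(best, a2_product(m * h, (m + 1) * h, a))
    return best


if __name__ == "__main__":
    a = 0.5
    # On [0, t] the product equals 1 / (1 - a**2) by scaling.
    print(a2_product(0.0, 1.0, a), 1.0 / (1.0 - a * a))
    print(sup_over_dyadic(a, 6))
```

For intervals $[0,t]$ the product equals $1/(1-a^2)$, which blows up as $|a| \to 1$; intervals away from the origin give smaller values, so in this restricted family the supremum is attained at the intervals touching $0$.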

Recently, the precise dependence of the $\mathcal{L}(L^2(w))$ norm of Calderon-Zygmund
operators on the Muckenhoupt characteristic $\|w\|_{A_2}$ has attracted interest, and the
following bound, optimal in general, has become known as the $A_2$ conjecture:

$$\|Tf\|_{L^2(w)} \le C(T)\,\|w\|_{A_2}\,\|f\|_{L^2(w)}. \tag{1.1}$$

By the sharp form of Rubio de Francia's extrapolation theorem due to Dragicevic,
Grafakos, Pereyra and Petermichl [6], this implies the corresponding weighted $L^p$
bound,

$$\|Tf\|_{L^p(w)} \le C_p(T)\,\|w\|_{A_p}^{\max\{1,\,1/(p-1)\}}\,\|f\|_{L^p(w)}, \qquad p \in (1,\infty), \tag{1.2}$$

where

$$\|w\|_{A_p} := \sup_Q \fint_Q w\,dx \cdot \Big( \fint_Q w^{-1/(p-1)}\,dx \Big)^{p-1}.$$

Here is a brief description of past progress on this problem. It concentrates on
the research on Calderon-Zygmund-type operators, for which the conjectured sharp
bounds are given by (1.1) and (1.2), but many other kinds of operators, sometimes
with different dependence on the weight, have also been considered in the literature.

(1) Although not strictly a Calderon-Zygmund operator, the Hardy-Littlewood
maximal operator $M$ is clearly closely related, and the sharp weighted line
of research was opened by Buckley [3], who proved (1.1) for $T = M$. (For
$M$, the right power of $\|w\|_{A_p}$ in (1.2) is $1/(p-1)$ for all $p \in (1,\infty)$.)

(2) Resolving a conjecture by Astala-Iwaniec-Saksman [1, Eq. (45)] with
implications to Beltrami equations, the case of the Beurling-Ahlfors transform
$B \in \mathcal{L}(L^2(\mathbb{C}))$ was first settled by Petermichl and Volberg [24], and with
an alternative proof by Dragicevic and Volberg [7]. Petermichl also
obtained the sharp bounds for the Hilbert transform $H \in \mathcal{L}(L^2(\mathbb{R}))$ [22],
and then for the Riesz transforms $R_i \in \mathcal{L}(L^2(\mathbb{R}^N))$ in arbitrary
dimension $N \in \mathbb{Z}_+$ [23]. All these results relied on ad hoc representations based
on specific symmetries of the operators in question, and Bellman function
arguments tailor-made for each particular situation.

(3) A unified approach to the earlier results for $B$, $H$ and $R_i$ was found by
Lacey, Petermichl and Reguera [13], who proved (1.1) for a general class of
"dyadic shifts", from which all the mentioned operators may be obtained by
suitable averaging. The original proof employed a two-weight inequality for
dyadic shifts due to Nazarov, Treil and Volberg [20]. It was substantially
simplified by Cruz-Uribe, Martell and Perez [5], based on a remarkable
formula of Lerner [15], which gives very precise and useful information on
a function in terms of its local mean oscillations.

(4) Vagharshakyan [25] found a way of recovering all sufficiently smooth, odd,
convolution-type Calderon-Zygmund operators in dimension $N = 1$ from
dyadic shifts, thereby proving (1.1) for all these operators. By a different
method, Lerner [16] was able to estimate all standard convolution-type
operators in arbitrary dimension by controlling them in terms of Wilson's
intrinsic square function [26]; however, this approach only gave (1.2) for
$p \in (1, \frac{3}{2}] \cup [3, \infty)$.

(5) The conjecture (1.1) concerning a strong-type bound was reduced to proving
the corresponding weak-type estimate (and even slightly less) by Perez,
Treil and Volberg [21]. Based on this reduction, the first confirmation of
(1.1) for a general class of non-convolution operators, but imposing heavy
smoothness requirements on the kernels, was obtained by Lacey, Reguera,
Sawyer, Uriarte-Tuero, Vagharshakyan and the author [10].

Altogether, the $A_2$ conjecture has now been verified in quite a number of cases.
(Note that no two of the just mentioned results of Vagharshakyan [25], Lerner [16],
and Lacey et al. [10] are strictly comparable.) And in this paper, the problem is
completely solved. Besides, the proof is based on quite general
metric-measure-theoretic objects (as opposed to the use of convolutions and regular
wavelets in the preceding contributions), which makes it likely to extend to further
situations like spaces of homogeneous type; see the discussion at the end of the paper.

1.3. Theorem. The estimate (1.1), and hence (1.2), holds for all Calderon-Zygmund
operators $T \in \mathcal{L}(L^2(\mathbb{R}^N))$, for all $N \in \mathbb{Z}_+$.

Just like the recent result of Lacey et al. [10], the proof relies on the reduction
of Perez, Treil and Volberg [21]. For an arbitrary Calderon-Zygmund operator $T$,
they proved that

$$\|T\|_{\mathcal{L}(L^2(w))} \le C(T) \Big( \|w\|_{A_2} + \sup_Q \frac{1}{w^{-1}(Q)^{1/2}} \|T(w^{-1} 1_Q)\|_{L^2(w)} + \sup_Q \frac{1}{w(Q)^{1/2}} \|T^*(w 1_Q)\|_{L^2(w^{-1})} \Big),$$

where $w(Q) := \int_Q w\,dx$ and similarly with $w^{-1}$, and $T^*$ is the adjoint with respect
to the unweighted $L^2$ duality. Thanks to the symmetry of $T$ and $T^*$ (both satisfy
the same Calderon-Zygmund bounds), as well as of $w$ and $w^{-1}$ (both have the same
$A_2$ characteristic), the first Perez-Treil-Volberg estimate above reduces the proof
of the $A_2$ conjecture to showing that

$$\|T(w 1_Q)\|_{L^2(w^{-1})} \le C(T)\,\|w\|_{A_2}\,w(Q)^{1/2} \tag{1.4}$$

for all Calderon-Zygmund operators $T$. (The second Perez-Treil-Volberg estimate
will not be used here; it is only recorded for the sake of pointing out the connection
to weak-type bounds.)

This paper is concerned with the proof of (1.4). The Calderon-Zygmund operator
$T$ will first be decomposed in terms of appropriate simpler operators. This was
also the strategy of Lacey et al. [10], where the decomposition was extracted from
the proofs of the T(1) theorems due to Beylkin-Coifman-Rokhlin [2], Figiel [8],
and Yamazaki [27]. However, the mentioned decomposition seems not to have been
optimal for the $A_2$ conjecture, as summing up the weighted estimates for the simple
operators required a high degree of smoothness on the kernel of $T$.

Thus, the first intermediate goal here is finding a better decomposition. And
this is once again provided by the proof of a T(1) theorem, this time the one for
nonhomogeneous spaces due to Nazarov, Treil and Volberg [12]. (A variant of the
same proof, from a more recent Nazarov-Treil-Volberg preprint [18], is also behind
the reduction of Perez, Treil and Volberg [21].) Recall that the basic philosophy
of this proof is expanding an operator in terms of the Haar basis associated to a
randomly chosen system of dyadic cubes; the part of the expansion living on so
called "good" cubes can be directly estimated, and the remaining "bad" part can
be forced to be an arbitrarily small fraction of the full operator norm. Thus the
bound will be of the form

$$\|T\| \le C_{\mathrm{good}}(r) + \varepsilon_{\mathrm{bad}}(r)\,\|T\|,$$

where $r$ is an adjustable parameter in the definition of good and bad cubes;
increasing $r$ will increase $C_{\mathrm{good}}(r)$ and decrease $\varepsilon_{\mathrm{bad}}(r)$, and it suffices, in principle,
to make $\varepsilon_{\mathrm{bad}}(r) < 1$. The problem is that, in the weighted case, the required size of
$r$ would seem to depend on $w$. So even if one could prove the desired dependence
$C_{\mathrm{good}}(r) \le c(r)\,\|w\|_{A_2}$ with $c(r)$ independent of $w$, this could be spoiled by the
necessity of taking $r = r(w)$.

The solution to this problem is proving that, on average, the bad part becomes
not only small but vanishing; in other words, a decomposition of an operator $T$
can be made by using Haar functions on good cubes only, with no error term
whatsoever (Theorem 3.1). This is an abstract result with no specific connection to
weighted inequalities, and it will possibly make the Nazarov-Treil-Volberg method
of random dyadic systems more flexibly applicable to further questions. Another
modification of the original randomisation argument is the use of only one random
dyadic system, rather than two independent copies. In this way, there will be a
stronger dyadic structure around, which is certainly a convenience, if not a necessity,
for the subsequent considerations.

Once the full reduction to good cubes is available, the proof proceeds along the
lines of the analysis of the good part in the Nazarov-Treil-Volberg T(1) theorem
[19], to extract several subseries of the Haar expansion, which are identified as new
operators in their own right. These auxiliary operators are already implicit in the
original Nazarov-Treil-Volberg argument [19], and their more explicit form was
identified in my extension of their result to the vector-valued situation [9], where
this explicit structure became more decisive. Here, it will be checked that these
new operators are precisely the dyadic shifts in the generality defined by Lacey,
Petermichl and Reguera [13]. Thus, closing the circle with the pioneering sharp
estimates for the classical integral transforms, it is proven here that all
Calderon-Zygmund operators may be written as averages of dyadic shifts (Theorem 4.2). In
fact, and this technical issue will be important for the final steps of the proof, one
only needs so called good dyadic shifts, where this goodness is closely related to
the goodness of dyadic cubes.

The final task, then, is proving a version of the estimate (1.4) for the good
dyadic shifts in place of $T$. For individual shifts, this estimate has been established
by Lacey-Petermichl-Reguera [13], with a simplified proof by Cruz-Uribe-Martell-Perez
[5]; however, their arguments give a dependence on certain parameters of
the shift, which grows too rapidly to allow summing up the estimates in the series
representation of $T$ in terms of these shifts. Appropriate improvements of these
bounds will be established in the final part of the paper (Theorem 6.1). Despite
the elegance of the Cruz-Uribe-Martell-Perez argument [5], I did not manage to
modify it for the required sharpness, and the new estimates will follow instead the
general outline of the original Lacey-Petermichl-Reguera proof [13].
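
2. Preliminaries

2.A. Systems of dyadic cubes. The standard dyadic system is

$$\mathscr{D}^0 := \bigcup_{k \in \mathbb{Z}} \mathscr{D}^0_k, \qquad \mathscr{D}^0_k := \{ 2^{-k}([0,1)^N + m) : m \in \mathbb{Z}^N \}.$$

For $I \in \mathscr{D}^0_k$ and a binary sequence $\beta = (\beta_j)_{j=-\infty}^{\infty} \in (\{0,1\}^N)^{\mathbb{Z}}$, let

$$I \dot+ \beta := I + \sum_{j > k} \beta_j 2^{-j}.$$

Following Nazarov, Treil and Volberg [19, Section 9.1], I will consider general dyadic
systems of the form

$$\mathscr{D} = \mathscr{D}^\beta := \{ I \dot+ \beta : I \in \mathscr{D}^0 \} = \bigcup_{k \in \mathbb{Z}} \mathscr{D}^\beta_k.$$

Given a cube $I = x + [0,\ell)^N$, let

$$\operatorname{ch}(I) := \{ x + \eta\ell/2 + [0,\ell/2)^N : \eta \in \{0,1\}^N \}$$

denote the collection of dyadic children of $I$. Thus $\mathscr{D}^\beta_{k+1} = \bigcup \{ \operatorname{ch}(I) : I \in \mathscr{D}^\beta_k \}$.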
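The nestedness of a shifted dyadic system can be checked by hand in dimension $N = 1$. The following is a minimal sketch, assuming that the binary sequence is truncated (all $\beta_j$ with $j$ above a cut-off `jmax` are taken to be zero, which is an assumption of the example and not of the paper); the function names are hypothetical. Exact rational arithmetic is used so that endpoints can be compared with equality.

```python
from fractions import Fraction


def shift(k, beta, jmax):
    """Truncated shift: sum of beta_j * 2**-j over k < j <= jmax."""
    return sum(
        (Fraction(beta.get(j, 0)) * Fraction(2) ** (-j) for j in range(k + 1, jmax + 1)),
        Fraction(0),
    )


def shifted_interval(k, m, beta, jmax):
    """Endpoints of the shifted interval for I = 2**-k * ([0, 1) + m), N = 1."""
    length = Fraction(2) ** (-k)
    left = m * length + shift(k, beta, jmax)
    return left, left + length


def children(interval):
    """The two dyadic children (left half, right half) of an interval."""
    left, right = interval
    mid = (left + right) / 2
    return [(left, mid), (mid, right)]


if __name__ == "__main__":
    beta = {1: 1, 2: 0, 3: 1}          # beta_j = 0 for all other j (assumption)
    parent = shifted_interval(0, 2, beta, jmax=3)
    print(parent, children(parent))
```

The children of the shifted interval of generation $k$ with index $m$ are the shifted intervals of generation $k+1$ with indices $2m + \beta_{k+1}$ and $2m + \beta_{k+1} + 1$, because the term $\beta_{k+1} 2^{-(k+1)}$ of the shift is absorbed into the index on the next generation.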

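2.B. Conditional expectations. The local conditional expectation operators and
their differences are denoted by

$$\mathbb{E}_I f := 1_I \langle f \rangle_I := 1_I \fint_I f\,dx := 1_I \frac{1}{|I|} \int_I f\,dx, \qquad \mathbb{D}_I f := \sum_{I' \in \operatorname{ch}(I)} \mathbb{E}_{I'} f - \mathbb{E}_I f,$$

and then

$$\mathbb{E}^\beta_k f := \sum_{I \in \mathscr{D}^\beta_k} \mathbb{E}_I f, \qquad \mathbb{D}^\beta_k f := \sum_{I \in \mathscr{D}^\beta_k} \mathbb{D}_I f.$$

Often, the parameter $\beta$ will be understood from the context, and the superscript $\beta$
dropped from this notation.

For $f \in L^1_{\mathrm{loc}}(\mathbb{R}^N)$, Lebesgue's differentiation (or martingale convergence)
theorem asserts that $\mathbb{E}_k f \to f$ almost everywhere, as $k \to \infty$. Since the $\mathbb{E}_k f$ are
dominated by the Hardy-Littlewood maximal function $Mf$, this convergence also
takes place in $L^2(w)$, as soon as $f \in L^2(w)$ and $w \in A_2$. This leads to the martingale
difference decomposition

$$f = \lim_{n \to \infty} \mathbb{E}_{n+1} f = \mathbb{E}_m f + \lim_{n \to \infty} \sum_{k=m}^{n} \mathbb{D}_k f = \sum_{I \in \mathscr{D}_m} \mathbb{E}_I f + \lim_{n \to \infty} \sum_{k=m}^{n} \sum_{I \in \mathscr{D}_k} \mathbb{D}_I f, \tag{2.1}$$

valid for any $m \in \mathbb{Z}$. The number $m$ will be considered fixed throughout most of
the arguments. By abuse of notation, the operator $\mathbb{D}_I$ will be redefined as $\mathbb{D}_I + \mathbb{E}_I$
for $I \in \mathscr{D}_m$; then the identity (2.1) attains a simpler form without the first sum on
the right.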
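The telescoping behind the martingale difference decomposition can be verified exactly on a toy example. The sketch below assumes a function on $[0,1)$ in dimension $N = 1$ that is constant on the $2^n$ standard dyadic intervals of length $2^{-n}$, so that the limit is already reached at generation $n$; the function names and this discretisation are assumptions of the example.

```python
from fractions import Fraction


def cond_exp(f, k, n):
    """E_k f for f given by 2**n cell values on [0, 1): averages over blocks of length 2**-k."""
    block = 2 ** (n - k)
    out = []
    for start in range(0, len(f), block):
        avg = sum(f[start:start + block], Fraction(0)) / block
        out.extend([avg] * block)
    return out


def mart_diff(f, k, n):
    """D_k f = E_{k+1} f - E_k f."""
    return [a - b for a, b in zip(cond_exp(f, k + 1, n), cond_exp(f, k, n))]


def reconstruct(f, m, n):
    """E_m f + sum over m <= k < n of D_k f, which telescopes to E_n f = f."""
    total = cond_exp(f, m, n)
    for k in range(m, n):
        total = [t + d for t, d in zip(total, mart_diff(f, k, n))]
    return total


if __name__ == "__main__":
    f = [Fraction(v) for v in [3, 1, 4, 1, 5, 9, 2, 6]]
    print(reconstruct(f, 0, 3) == f)
```

Here the sum stops at $k = n - 1$ because $\mathbb{E}_n f = f$ for such a step function; this is the finite counterpart of the limit $n \to \infty$ in the decomposition above.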
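
2.C. Haar functions. Sometimes it is useful to write the operators $\mathbb{D}_I$ and $\mathbb{E}_I$ in
terms of Haar functions $h_I^\eta$, $\eta \in \{0,1\}^N$, which satisfy

$$\operatorname{supp} h_I^\eta \subseteq I, \qquad h_I^\eta|_{I'} = \text{const} \ \ \forall I' \in \operatorname{ch}(I), \qquad \|h_I^\eta\|_\infty \le |I|^{-1/2}$$

as well as

$$\int h_I^\eta h_I^\theta\,dx = \delta_{\eta\theta}, \qquad h_I^0 := |I|^{-1/2} 1_I.$$

(The precise definition of $h_I^\eta$ for $\eta \neq 0$ may be done in a variety of ways, and is not
important for the present purposes.) Then

$$\mathbb{E}_I f = h_I^0 \langle h_I^0, f \rangle, \qquad \mathbb{D}_I f = \sum_{\eta \in \{0,1\}^N \setminus \{0\}} h_I^\eta \langle h_I^\eta, f \rangle.$$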

2.D. Random dyadic systems; good and bad cubes. Choosing a random
dyadic system simply amounts to a random choice of the parameterising binary
sequence $\beta = (\beta_j)_{j \in \mathbb{Z}}$, according to the canonical product probability measure $\mathbb{P}_\beta$ on
$(\{0,1\}^N)^{\mathbb{Z}}$ which makes the coordinates $\beta_j$ independent and identically distributed
with $\mathbb{P}_\beta(\beta_j = \eta) = 2^{-N}$ for all $\eta \in \{0,1\}^N$. The symbol $\mathbb{E}_\beta$ denotes the expectation
over the random variables $\beta_j$, $j \in \mathbb{Z}$; I will also use conditional expectations of the
type $\mathbb{E}_\beta[\,\cdot\,| \beta_j : j \in \mathscr{J}]$, which means (as usual) that the variables $\beta_j$, $j \in \mathscr{J}$, are
held fixed, and only those $\beta_j$ with $j \in \mathbb{Z} \setminus \mathscr{J}$ are averaged out.

Following Nazarov, Treil and Volberg, a dyadic cube $I$ will be called bad, if it
is relatively close to the boundary of a much bigger dyadic cube. However, only
one dyadic system rather than two will be considered at a time here, so $I$ will
be compared with bigger cubes of the same dyadic system. More precisely, given
parameters $r \in \mathbb{Z}_+$ and $\gamma \in (0, \frac{1}{2})$, a cube $I \in \mathscr{D}$ is said to be bad if there exists
a $J \in \mathscr{D}$ with $\ell(J) \ge 2^r \ell(I)$ such that $\operatorname{dist}(I, \partial J) \le \ell(I)^\gamma \ell(J)^{1-\gamma}$. Otherwise, $I$ is
said to be good.

A pair of cubes $(I, J) \in \mathscr{D} \times \mathscr{D}$ is said to be good, if the smaller cube, say $I$,
satisfies $\operatorname{dist}(I, \partial K) > \ell(I)^\gamma \ell(K)^{1-\gamma}$ for all $K \in \mathscr{D}$ with $2^r \ell(I) \le \ell(K) \le \ell(J)$.
(Note that the condition is trivially true for $\ell(J) < 2^r \ell(I)$.)

In the treatment of a Calderon-Zygmund kernel with Holder exponent $\alpha$, the
choice $\gamma := \frac{\alpha}{2(N + \alpha)}$ is useful. In the sequel, some simple algebra involving this
number will take place every now and then; however, the reader should not be
misled to think that this precise choice is particularly critical. I have made this
choice mainly because (i) it works and (ii) it is the one chosen by Nazarov-Treil-Volberg
and used in several papers by now. However, any smaller $\gamma$ (depending
only on $\alpha$ and $N$) would work equally well.

The cubes of $\mathscr{D}^\beta$ will often be explicitly considered in the form $I \dot+ \beta$, with $I \in \mathscr{D}^0$.
Under this parameterisation, it is important to observe a fundamental independence
property regarding goodness. First, by definition, the spatial position of

$$I \dot+ \beta := I + \sum_{j : 2^{-j} < \ell(I)} \beta_j 2^{-j}$$

depends only on $\beta_j$ for $2^{-j} < \ell(I)$. Second, the relative position of $I \dot+ \beta$ with
respect to a bigger cube

$$J \dot+ \beta = J + \sum_{j : 2^{-j} < \ell(I)} \beta_j 2^{-j} + \sum_{j : \ell(I) \le 2^{-j} < \ell(J)} \beta_j 2^{-j}$$

depends only on $\beta_j$ for $\ell(I) \le 2^{-j} < \ell(J)$. Thus, the position and goodness of $I \dot+ \beta$
are independent.

It is an immediate consequence of symmetry that the probability of a particular
cube $I \in \mathscr{D}$ being bad is a number depending only on $r$, $\gamma$ and $N$. This number,
$\pi_{\mathrm{bad}}$, may be easily estimated as $\pi_{\mathrm{bad}} \lesssim_{\gamma,N} 2^{-r\gamma}$. (Thanks to the above mentioned
independence of position and goodness, the computation is only slightly different
from the case of two independent random systems considered in [19].) In much of the
earlier work based on good and bad cubes, it was important that this number can
be made as small as one likes by fixing $r$ large enough, and the requirements for its
magnitude depended on the implicit constants in certain square function estimates.
Here, it will only be necessary to choose $r$ large enough so that $\pi_{\mathrm{bad}} < 1$, hence
$\pi_{\mathrm{good}} := 1 - \pi_{\mathrm{bad}} > 0$, which is a simple geometric condition.
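To see concretely why the requirement that the bad probability be strictly less than one is a genuine geometric condition on $r$, one can count bad positions in a toy model. The sketch below is only an illustration under several assumptions of the example: dimension $N = 1$, the standard (unshifted) grid, unit intervals $[a, a+1)$, and only finitely many scales $2^k$ with $r \le k \le$ `top`. Counting over all positions $a$ in a block of length $2^{\mathtt{top}}$ is used here as a stand-in for averaging over the random shift.

```python
def is_bad(a, r, gamma, top):
    """Is the unit interval [a, a+1) bad with respect to some scale 2**k, r <= k <= top?"""
    for k in range(r, top + 1):
        big = 2 ** k
        q = a // big                      # [a, a+1) lies inside [q*big, (q+1)*big)
        dist = min(a - q * big, (q + 1) * big - (a + 1))
        # Threshold l(I)**gamma * l(J)**(1 - gamma) with l(I) = 1 and l(J) = big.
        if dist <= big ** (1 - gamma):
            return True
    return False


def bad_fraction(r, gamma, top):
    """Fraction of bad unit intervals among the 2**top positions in [0, 2**top)."""
    n = 2 ** top
    return sum(is_bad(a, r, gamma, top) for a in range(n)) / n


if __name__ == "__main__":
    print(bad_fraction(5, 0.25, 8))    # r too small for this gamma
    print(bad_fraction(7, 0.45, 12))   # strictly between 0 and 1
```

With $\gamma = 1/4$ and $r = 5$, the positions that survive the test at scale $2^5$ are all rejected at scale $2^6$, so every interval is bad; with $\gamma = 0.45$ and $r = 7$ a crude union bound over the scales already shows that a positive fraction of positions is good.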

2.E. Notational conventions. The proof of the $A_2$ conjecture is now about to
start. It will deal with a measure $w \in A_2$ and its dual measure $\sigma := w^{-1}$, which
has the same $A_2$ characteristic $\|w\|_{A_2} = \|\sigma\|_{A_2}$.

In the estimate to be proven, the precise dependence on the weight $w$ is decisive,
so such dependence will always be indicated explicitly. On the other hand, the
particular dependence on the properties of the arbitrary but fixed Calderon-Zygmund
operator $T$ will be unimportant. Accordingly, the shorthand $A \lesssim B$ will be used
for $A \le C(T)\,B$, where $C(T)$ is any finite quantity depending at most on $T$. Here
it is understood that the operator $T$ carries with it, in particular, the information
on the dimension $N$ of the domain $\mathbb{R}^N$, as well as a Holder exponent $\alpha$ and the
related constant $C$ from the standard estimates verified by its kernel. The number
$\gamma$ and a suitable choice of $r$ depend only on these quantities.

3. The good martingale difference representation 

The representation result to be proven in this section is of an abstract nature,
as the reader will easily realise, but the aim of its formulation below will not be
the maximal generality, but rather the weighted application at hand in the present
paper. Consider an integer $m$ fixed, while $n$ is a variable, which is taken to approach
infinity. A summation over some intervals $I \in \mathscr{I}$, with the additional restriction
that $2^{-n} < \ell(I) \le 2^{-m}$, will be abbreviated as

$$\sum^{n}_{I \in \mathscr{I}} := \sum_{\substack{I \in \mathscr{I} \\ 2^{-n} < \ell(I) \le 2^{-m}}}.$$

It will not quite be true that only good cubes are needed in the representation;
however, it can be arranged that the bigger cube in any required pairing

$$T_{JI} := \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle$$

is always good, and also the pair of cubes is good, meaning that the smaller cube
stays away from the boundaries of the bigger cubes up to the size of the bigger
cube, and this slightly restricted joint goodness will be enough for the subsequent
considerations.

An intermediate form between the original random martingale difference
decomposition of Nazarov, Treil and Volberg [19] and the present formulation is found
in the proof of my vector-valued nonhomogeneous Tb theorem [9], although there
still with two independent dyadic systems.
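
3.1. Theorem. Let $T \in \mathcal{L}(L^2(w))$ and $f \in L^2(w)$, $g \in L^2(\sigma)$ be compactly
supported. Then the following representation is valid:

$$\begin{aligned}
\langle g, Tf \rangle = \frac{1}{\pi_{\mathrm{good}}^2} \lim_{n \to \infty} \mathbb{E}_\beta \Big[ & \sum^{n}_{\substack{I, J \in \mathscr{D}^0 \\ \ell(J) \ge \ell(I)}} T_{J \dot+ \beta,\, I \dot+ \beta}\ \mathbb{E}_\beta\big[ 1_{\mathrm{good}(\beta)}(I \dot+ \beta) \,\big|\, \beta_j : 2^{-j} < \ell(J) \big]\, 1_{\mathrm{good}(\beta)}(J \dot+ \beta) \\
& + \sum^{n}_{\substack{I, J \in \mathscr{D}^0 \\ \ell(J) < \ell(I)}} T_{J \dot+ \beta,\, I \dot+ \beta}\ \mathbb{E}_\beta\big[ 1_{\mathrm{good}(\beta)}(J \dot+ \beta) \,\big|\, \beta_j : 2^{-j} < \ell(I) \big]\, 1_{\mathrm{good}(\beta)}(I \dot+ \beta) \Big] \\
= \frac{1}{\pi_{\mathrm{good}}^2} \lim_{n \to \infty} \mathbb{E}_\beta & \sum^{n}_{\substack{I, J \in \mathscr{D}^\beta \\ \text{bigger cube good} \\ \text{pair } (I, J) \text{ good}}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle \cdot \pi_{IJ},
\end{aligned}$$

where $\pi_{IJ} \in [0, 1]$ are the values of the conditional probabilities on the previous lines
after reindexing the summation in terms of $\mathscr{D}^\beta$. The last summation condition is
shorthand for the requirement that the cube $J$ is good if $\ell(J) \ge \ell(I)$, the cube $I$ is
good if $\ell(I) > \ell(J)$, and the pair of cubes $(I, J)$ is always good.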

The rest of this section is concerned with the proof of this theorem. Observe
first that

$$\langle g, Tf \rangle = \langle g, T \mathbb{E}_n f \rangle + \langle g, T(f - \mathbb{E}_n f) \rangle,$$

where the second term satisfies

$$|\langle g, T(f - \mathbb{E}_n f) \rangle| \le \|g\|_{L^2(\sigma)}\, \|T\|_{\mathcal{L}(L^2(w))}\, \|f - \mathbb{E}_n f\|_{L^2(w)},$$

and the last factor is dominated by $C(w)\|f\|_{L^2(w)}$ and tends to zero as $n \to \infty$.
(At this point, the precise dependence of $C(w)$ on the weight is not important.)
By dominated convergence, also the expectation over the different dyadic systems
of this quantity tends to zero as $n \to \infty$. Thus

$$\langle g, Tf \rangle = \mathbb{E}_\beta \langle g, T \mathbb{E}_n f \rangle + \varepsilon_n,$$

where $\varepsilon_n \to 0$ as $n \to \infty$. I keep using $\varepsilon_n$ in this meaning; it need not be the exact
same quantity on each occurrence. The compact support of $f$ ensures that $\mathbb{E}_n f$ is
the finite sum

$$\mathbb{E}_n f = \sum^{n}_{I \in \mathscr{D}} \mathbb{D}_I f;$$

recall that $\mathbb{D}_I$ is abuse for $\mathbb{D}_I + \mathbb{E}_I$ when $\ell(I) = 2^{-m}$.

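Now I investigate the effect of the expectation $\mathbb{E}_\beta$ in more detail. Since $\mathbb{D}_{I \dot+ \beta} f$
depends only on $\beta_j$ for $2^{-j} < \ell(I)$, whereas the goodness of $I \dot+ \beta$ depends on the
complementary parameters $\beta_j$ for $2^{-j} \ge \ell(I)$, there holds by independence that

$$\begin{aligned}
\mathbb{E}_\beta\big[ \langle g, T \mathbb{D}_{I \dot+ \beta} f \rangle\, 1_{\mathrm{good}(\beta)}(I \dot+ \beta) \big] &= \mathbb{E}_\beta\big[ \langle g, T \mathbb{D}_{I \dot+ \beta} f \rangle \big] \cdot \mathbb{E}_\beta\big[ 1_{\mathrm{good}(\beta)}(I \dot+ \beta) \big] \\
&= \mathbb{E}_\beta\big[ \langle g, T \mathbb{D}_{I \dot+ \beta} f \rangle \big] \cdot \pi_{\mathrm{good}}
\end{aligned}$$

and hence

$$\begin{aligned}
\mathbb{E}_\beta \langle g, T \mathbb{E}_n f \rangle &= \mathbb{E}_\beta \sum^{n}_{I \in \mathscr{D}^\beta} \langle g, T \mathbb{D}_I f \rangle = \sum^{n}_{I \in \mathscr{D}^0} \mathbb{E}_\beta \langle g, T \mathbb{D}_{I \dot+ \beta} f \rangle \\
&= \frac{1}{\pi_{\mathrm{good}}} \sum^{n}_{I \in \mathscr{D}^0} \mathbb{E}_\beta\big[ \langle g, T \mathbb{D}_{I \dot+ \beta} f \rangle\, 1_{\mathrm{good}(\beta)}(I \dot+ \beta) \big] \\
&= \frac{1}{\pi_{\mathrm{good}}} \mathbb{E}_\beta \sum^{n}_{I \in \mathscr{D}^\beta_{\mathrm{good}}} \langle g, T \mathbb{D}_I f \rangle.
\end{aligned}$$

Moreover, writing $g = \mathbb{E}_n g + (g - \mathbb{E}_n g)$, it follows that

$$\sum^{n}_{I \in \mathscr{D}^\beta_{\mathrm{good}}} \langle g, T \mathbb{D}_I f \rangle = \sum^{n}_{J \in \mathscr{D}^\beta} \sum^{n}_{I \in \mathscr{D}^\beta_{\mathrm{good}}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle + \Big\langle g - \mathbb{E}_n g,\ T \sum^{n}_{I \in \mathscr{D}^\beta_{\mathrm{good}}} \mathbb{D}_I f \Big\rangle,$$

where the last term is dominated by

$$\|g - \mathbb{E}_n g\|_{L^2(\sigma)}\, \|T\|_{\mathcal{L}(L^2(w))}\, C(w)\, \|f\|_{L^2(w)},$$

and the first factor is bounded by $C(w)\|g\|_{L^2(\sigma)}$ and tends to zero as $n \to \infty$. By
dominated convergence again, it follows that

$$\langle g, Tf \rangle = \frac{1}{\pi_{\mathrm{good}}} \mathbb{E}_\beta \sum^{n}_{J \in \mathscr{D}^\beta} \sum^{n}_{I \in \mathscr{D}^\beta_{\mathrm{good}}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle + \varepsilon_n.$$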

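I keep manipulating the double sum, making use of the dependence of the various
random quantities on the different parameters $\beta_j$, as well as basic properties of
conditional expectations. There holds

$$\frac{1}{\pi_{\mathrm{good}}} \mathbb{E}_\beta \sum^{n}_{J \in \mathscr{D}^\beta} \sum^{n}_{I \in \mathscr{D}^\beta_{\mathrm{good}}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle = \frac{1}{\pi_{\mathrm{good}}} \mathbb{E}_\beta \Big( \sum^{n}_{J \in \mathscr{D}^\beta} \sum^{n}_{\substack{I \in \mathscr{D}^\beta_{\mathrm{good}} \\ \ell(J) \ge \ell(I)}} + \sum^{n}_{J \in \mathscr{D}^\beta} \sum^{n}_{\substack{I \in \mathscr{D}^\beta_{\mathrm{good}} \\ \ell(J) < \ell(I)}} \Big) \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle =: A + B,$$

and further

$$\begin{aligned}
A &= \frac{1}{\pi_{\mathrm{good}}} \sum^{n}_{\substack{I, J \in \mathscr{D}^0 \\ \ell(J) \ge \ell(I)}} \mathbb{E}_\beta\big[ \langle \mathbb{D}_{J \dot+ \beta} g, T \mathbb{D}_{I \dot+ \beta} f \rangle \cdot 1_{\mathrm{good}(\beta)}(I \dot+ \beta) \big] \\
&= \frac{1}{\pi_{\mathrm{good}}} \sum^{n}_{\substack{I, J \in \mathscr{D}^0 \\ \ell(J) \ge \ell(I)}} \mathbb{E}_\beta\Big[ \langle \mathbb{D}_{J \dot+ \beta} g, T \mathbb{D}_{I \dot+ \beta} f \rangle \cdot \mathbb{E}_\beta\big[ 1_{\mathrm{good}(\beta)}(I \dot+ \beta) \,\big|\, \beta_j : 2^{-j} < \ell(J) \big] \Big],
\end{aligned}$$

where the first factor inside $\mathbb{E}_\beta$ only depends on $\beta_j$ for $2^{-j} < \ell(J)$, which made it
possible to replace the second factor by its conditional expectation with respect to these
variables. Let then

$$\pi^{\ell(J)}_{I \dot+ \beta} := \mathbb{E}_\beta\big[ 1_{\mathrm{good}(\beta)}(I \dot+ \beta) \,\big|\, \beta_j : 2^{-j} < \ell(J) \big];$$

by definition, this conditional probability only depends on $\beta_j$ for $2^{-j} < \ell(J)$. As
the goodness of $J \dot+ \beta$ depends on the complementary variables $\beta_j$ for $2^{-j} \ge \ell(J)$,
independence may be used again to write

$$\begin{aligned}
& \mathbb{E}_\beta\big[ \langle \mathbb{D}_{J \dot+ \beta} g, T \mathbb{D}_{I \dot+ \beta} f \rangle \cdot \pi^{\ell(J)}_{I \dot+ \beta} \big] \cdot \mathbb{E}_\beta\big[ 1_{\mathrm{good}(\beta)}(J \dot+ \beta) \big] \\
&\qquad = \mathbb{E}_\beta\big[ \langle \mathbb{D}_{J \dot+ \beta} g, T \mathbb{D}_{I \dot+ \beta} f \rangle \cdot \pi^{\ell(J)}_{I \dot+ \beta} \cdot 1_{\mathrm{good}(\beta)}(J \dot+ \beta) \big].
\end{aligned}$$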
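
Using this and recalling that $\mathbb{E}_\beta[1_{\mathrm{good}(\beta)}(J \dot+ \beta)] = \pi_{\mathrm{good}}$, there holds

$$A = \frac{1}{\pi_{\mathrm{good}}^2} \sum^{n}_{\substack{I, J \in \mathscr{D}^0 \\ \ell(J) \ge \ell(I)}} \mathbb{E}_\beta\big[ \langle \mathbb{D}_{J \dot+ \beta} g, T \mathbb{D}_{I \dot+ \beta} f \rangle \cdot \pi^{\ell(J)}_{I \dot+ \beta} \cdot 1_{\mathrm{good}(\beta)}(J \dot+ \beta) \big].$$

While the conditional probability $\pi^{\ell(J)}_{I \dot+ \beta}$ is some number between 0 and 1 in
general, it is important to notice a particular case when it is zero: this is when
$I \dot+ \beta$ is already bad with respect to some interval $K \in \mathscr{D}^\beta$ of length at most $\ell(J)$,
in particular when $I \dot+ \beta$ is bad with respect to $J \dot+ \beta$. Hence, if $\pi^{\ell(J)}_{I \dot+ \beta} > 0$, then
$(I \dot+ \beta, J \dot+ \beta)$ is good, and this additional restriction may be introduced without
changing the value of the sum. Hence, reindexing in terms of $\mathscr{D}^\beta$ again,

$$A = \frac{1}{\pi_{\mathrm{good}}^2} \mathbb{E}_\beta \sum^{n}_{I \in \mathscr{D}^\beta} \sum^{n}_{\substack{J \in \mathscr{D}^\beta_{\mathrm{good}} \\ \ell(J) \ge \ell(I) \\ (I, J) \text{ good}}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle \cdot \pi_{IJ},$$

for certain numbers $\pi_{IJ} \in [0,1]$, whose dependence on $\beta$ is suppressed from the
notation.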

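In part $B$, simply by independence (the first factor depends on $\beta_j$ for $2^{-j} < \ell(I)$,
the second on $\beta_j$ for $2^{-j} \ge \ell(I)$):

$$\begin{aligned}
B &= \frac{1}{\pi_{\mathrm{good}}} \sum^{n}_{\substack{I, J \in \mathscr{D}^0 \\ \ell(J) < \ell(I)}} \mathbb{E}_\beta\big[ \langle \mathbb{D}_{J \dot+ \beta} g, T \mathbb{D}_{I \dot+ \beta} f \rangle \cdot 1_{\mathrm{good}(\beta)}(I \dot+ \beta) \big] \\
&= \sum^{n}_{\substack{I, J \in \mathscr{D}^0 \\ \ell(J) < \ell(I)}} \mathbb{E}_\beta \langle \mathbb{D}_{J \dot+ \beta} g, T \mathbb{D}_{I \dot+ \beta} f \rangle = \mathbb{E}_\beta \sum^{n}_{\substack{I, J \in \mathscr{D}^\beta \\ \ell(J) < \ell(I)}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle.
\end{aligned}$$

Altogether, it has now been shown that

$$\langle g, Tf \rangle = \frac{1}{\pi_{\mathrm{good}}^2} \mathbb{E}_\beta \sum^{n}_{I \in \mathscr{D}^\beta} \sum^{n}_{\substack{J \in \mathscr{D}^\beta_{\mathrm{good}} \\ \ell(J) \ge \ell(I) \\ (I, J) \text{ good}}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle \cdot \pi_{IJ} + \mathbb{E}_\beta \sum^{n}_{\substack{I, J \in \mathscr{D}^\beta \\ \ell(J) < \ell(I)}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle + \varepsilon_n,$$

but also

$$\langle g, Tf \rangle = \langle \mathbb{E}_n g, T \mathbb{E}_n f \rangle + \varepsilon_n = \mathbb{E}_\beta \sum^{n}_{\substack{I, J \in \mathscr{D}^\beta \\ \ell(J) \ge \ell(I)}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle + \mathbb{E}_\beta \sum^{n}_{\substack{I, J \in \mathscr{D}^\beta \\ \ell(J) < \ell(I)}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle + \varepsilon_n.$$

Comparing these equalities, it follows that

$$\mathbb{E}_\beta \sum^{n}_{\substack{I, J \in \mathscr{D}^\beta \\ \ell(J) \ge \ell(I)}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle = \frac{1}{\pi_{\mathrm{good}}^2} \mathbb{E}_\beta \sum^{n}_{I \in \mathscr{D}^\beta} \sum^{n}_{\substack{J \in \mathscr{D}^\beta_{\mathrm{good}} \\ \ell(J) \ge \ell(I) \\ (I, J) \text{ good}}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle \cdot \pi_{IJ} + \varepsilon_n.$$

A symmetric treatment, with the roles of $I$ and $J$ reversed, also shows that

$$\mathbb{E}_\beta \sum^{n}_{\substack{I, J \in \mathscr{D}^\beta \\ \ell(J) < \ell(I)}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle = \frac{1}{\pi_{\mathrm{good}}^2} \mathbb{E}_\beta \sum^{n}_{I \in \mathscr{D}^\beta_{\mathrm{good}}} \sum^{n}_{\substack{J \in \mathscr{D}^\beta \\ \ell(J) < \ell(I) \\ (J, I) \text{ good}}} \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle \cdot \pi_{IJ} + \varepsilon_n$$

for some further numbers $\pi_{IJ} \in [0, 1]$ related to conditional probabilities as before.
Thus

$$\begin{aligned}
\langle g, Tf \rangle &= \mathbb{E}_\beta \Big( \sum^{n}_{I \in \mathscr{D}^\beta} \sum^{n}_{\substack{J \in \mathscr{D}^\beta \\ \ell(J) \ge \ell(I)}} + \sum^{n}_{I \in \mathscr{D}^\beta} \sum^{n}_{\substack{J \in \mathscr{D}^\beta \\ \ell(J) < \ell(I)}} \Big) \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle + \varepsilon_n \\
&= \frac{1}{\pi_{\mathrm{good}}^2} \mathbb{E}_\beta \Big( \sum^{n}_{I \in \mathscr{D}^\beta} \sum^{n}_{\substack{J \in \mathscr{D}^\beta_{\mathrm{good}} \\ \ell(J) \ge \ell(I) \\ (I, J) \text{ good}}} + \sum^{n}_{I \in \mathscr{D}^\beta_{\mathrm{good}}} \sum^{n}_{\substack{J \in \mathscr{D}^\beta \\ \ell(J) < \ell(I) \\ (J, I) \text{ good}}} \Big) \langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle \cdot \pi_{IJ} + \varepsilon_n,
\end{aligned}$$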
which is the claim of the theorem. 

4. Decomposition into dyadic shifts 

With the martingale difference decomposition of the previous section as the starting
point, the next goal is to express the operator $T$ as an average of fundamental
building blocks called dyadic shifts. It is first in order to give a definition. Although
expressed somewhat differently, it is essentially equivalent to that given by Lacey,
Petermichl and Reguera [13, Definition 1.5].

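4.1. Definition. A dyadic shift with parameters $(u,v)$ is an operator

$$Ш f = \sum_{K \in \mathscr{D}} A_K f,$$

where $\mathscr{D}$ is a dyadic system and each $A_K$ has the form

$$A_K f(x) := \fint_K a_K(x,y) f(y)\,dy, \qquad \|a_K\|_\infty \le 1,$$

$$a_K(x,y) = \sum_{\substack{I \in \mathscr{D};\ I \subseteq K \\ \ell(I) = 2^{-u}\ell(K)}}\ \sum_{\substack{J \in \mathscr{D};\ J \subseteq K \\ \ell(J) = 2^{-v}\ell(K)}}\ \sum_{\eta, \theta \in \{0,1\}^N} a^{\eta\theta}_{IJK}\, h^\theta_J(x)\, h^\eta_I(y).$$

A dyadic shift is called finite, if only finitely many $A_K$ are nonzero; bounded, if
$\|Ш f\|_{L^2} \le \|f\|_{L^2}$; and good, if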

$$\operatorname{dist}(J, \partial K) \ge \tfrac{1}{2}\, \ell(J)^\gamma \ell(K)^{1-\gamma},$$

and similarly with $I$ in place of $J$, for all $I$ and $J$ for which some $a^{\eta\theta}_{IJK}$ is nonzero.

Only finite shifts will be needed in the present considerations. This is a qualitative
convenience, which ensures that no problems of convergence can arise; however,
all the estimates will obviously have to be independent of the number of nonzero
$A_K$. The goal of this section is to express a Calderon-Zygmund operator as a weak
limit of averages of good, finite, uniformly bounded dyadic shifts:

4.2. Theorem. Let $T \in \mathcal{L}(L^2)$ be a bounded Calderon-Zygmund operator (hence
also $T \in \mathcal{L}(L^2(w))$) and $f \in L^2(w)$, $g \in L^2(\sigma)$ be compactly supported. Then

$$\langle g, Tf \rangle = \lim_{n \to \infty} \mathbb{E}_\beta \sum_{u,v=0}^{\infty} 2^{-\max(u,v)\alpha/2}\, \langle g, Ш^{uv}_{n\beta} f \rangle,$$

where $Ш^{uv}_{n\beta}$ is a good finite dyadic shift adapted to the dyadic system $\mathscr{D}^\beta$, with
parameters $(u,v)$, and $\|Ш^{uv}_{n\beta} f\|_{L^2} \lesssim \|f\|_{L^2}$ uniformly in $u$, $v$, $n$ and $\beta$.

Consider the representation of $\langle g, Tf \rangle$ provided by the previous section and, for
the moment, the part of the series with $\ell(I) \le \ell(J)$. The summation conditions

$$I \in \mathscr{D}^\beta, \quad J \in \mathscr{D}^\beta_{\mathrm{good}}, \quad (I, J) \text{ good}, \quad 2^{-n} < \ell(I) \le \ell(J) \le 2^{-m} \tag{4.3}$$

will be implicitly in force until further notice; only additional restrictions in
summation will be indicated explicitly.

I rearrange the summation following the well-known procedure from Nazarov,
Treil and Volberg [19]. (Also the subsequent analysis will closely follow [19], as well
as [9]. Some details will only be cited from these sources.)

$$\sum_{\ell(I) \le \ell(J)} = \sum_{\operatorname{dist}(I,J) \ge \ell(I)} + \sum_{\substack{\operatorname{dist}(I,J) < \ell(I) \\ \ell(I) < 2^{-r}\ell(J)}} + \sum_{\substack{\operatorname{dist}(I,J) < \ell(I) \\ \ell(I) \ge 2^{-r}\ell(J)}} =: \Sigma_{\mathrm{out}} + \Sigma_{\mathrm{in}} + \Sigma_{\mathrm{near}}.$$

When $I$ and $J$ are taken from the same dyadic system, as is the case here, the
condition $\operatorname{dist}(I, J) < \ell(I) \le \ell(J)$ in fact implies that $\operatorname{dist}(I, J) = 0$.

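4.A. The term $\Sigma_{\mathrm{out}}$. For the analysis of $\Sigma_{\mathrm{out}}$, recall the notion of the long distance
[9, Definition 6.3]

$$D(I, J) := \ell(I) + \operatorname{dist}(I, J) + \ell(J)$$

as well as the integer-valued function [9, end of Section 5]

$$\theta(j) := \Big\lceil \frac{\gamma j + r}{1 - \gamma} \Big\rceil.$$

Then

$$\Sigma_{\mathrm{out}} = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty}\ \sum_{\substack{\ell(J) = 2^i \ell(I) \\ 2^j < D(I,J)/\ell(J) \le 2^{j+1}}} =: \sum_{i,j=0}^{\infty} \sigma^{ij}_{\mathrm{out}}.$$

For $I$ and $J$ appearing in $\sigma^{ij}_{\mathrm{out}}$, using the goodness of $J$, one can readily check [9,
a few lines after (7.5)], that $J \subseteq I^{(i+j+\theta(j))}$, where $I^{(k)}$ indicates the $k$ generations
older dyadic ancestor of $I$: the unique $I^{(k)} \in \mathscr{D}$ with $I^{(k)} \supseteq I$ and $\ell(I^{(k)}) = 2^k \ell(I)$.
Thus, taking $K := I^{(i+j+\theta(j))} \in \mathscr{D}^\beta$ as a new auxiliary summation variable, one
can write

$$\sigma^{ij}_{\mathrm{out}} = \sum_{K \in \mathscr{D}^\beta}\ \sum_{\substack{J \in \mathscr{D}^\beta_{\mathrm{good}};\ J \subseteq K \\ \ell(J) = 2^{-j-\theta(j)}\ell(K)}}\ \sum_{\substack{I \in \mathscr{D}^\beta;\ I \subseteq K;\ (I,J) \text{ good} \\ \ell(I) = 2^{-i}\ell(J) \\ 2^j < D(I,J)/\ell(J) \le 2^{j+1}}} =: \sum_{K \in \mathscr{D}^\beta} \sigma^{ij}_K. \tag{4.4}$$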



The next task is to check that $\sigma^{ij}_K$ is of the form $\langle g, A_K f \rangle$. Recalling the
suppressed summands $\langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle\, \pi_{IJ}$ and invoking the Haar functions,

$$\sigma^{ij}_K = \sum_{I, J} \sum_{\eta, \theta} \langle g, h^\theta_J \rangle \cdot \langle h^\theta_J, T h^\eta_I \rangle \cdot \langle h^\eta_I, f \rangle \cdot \pi_{IJ},$$

where the summation conditions on $I$, $J$ are as in (4.4), while $\eta$, $\theta$ run over
$\{0,1\}^N \setminus \{0\}$, except possibly when $I \in \mathscr{D}^\beta_m$ or $J \in \mathscr{D}^\beta_m$, in which case also the noncancellative
Haar functions $h^0_I$ or $h^0_J$ are allowed. Also recall that $\pi_{IJ} \in [0,1]$; no further
properties of these conditional probabilities will be needed in the treatment of this
part of the sum. For the coefficient $\langle h^\theta_J, T h^\eta_I \rangle$, standard kernel estimates and the
goodness of $I$ in the case that $\ell(I) < 2^{-r}\ell(J)$ give [19, Lemmas 6.1 and 6.4]

$$|\langle h^\theta_J, T h^\eta_I \rangle| \lesssim \frac{\ell(I)^{\alpha/2} \ell(J)^{\alpha/2}}{D(I,J)^{N+\alpha}}\, |J|^{1/2} |I|^{1/2} \lesssim 2^{-i\alpha/2}\, 2^{-j[\alpha - \gamma N/(1-\gamma)]}\, \frac{|J|^{1/2} |I|^{1/2}}{|K|}.$$

The above estimate depends on the fact that the Haar function $h_I$ related to the
smaller cube $I$ is a cancellative one. Since the noncancellative Haar functions only
appear on generation $m$, the claimed fact could only fail if both $\ell(I) = \ell(J) = 2^{-m}$.
But one can choose $m$ so small (i.e., large negative) that at most $2^N$ cubes of length
$2^{-m}$ intersect the union of the supports of $f$ and $g$. Then all relevant pairs of cubes
with $\ell(I) = \ell(J) = 2^{-m}$ are less than their common sidelength apart, and hence
they will fall into the term $\Sigma_{\mathrm{near}}$.
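Writing

$$a^{\eta\theta}_{IJ} := 2^{i\alpha/2}\, 2^{j[\alpha - \gamma N/(1-\gamma)]} \cdot \langle h^\theta_J, T h^\eta_I \rangle\, \pi_{IJ}, \qquad |a^{\eta\theta}_{IJ}| \lesssim |I|^{1/2} |J|^{1/2} / |K|,$$

there holds

$$\Sigma_{\mathrm{out}} = \sum_{i,j=0}^{\infty} 2^{-i\alpha/2}\, 2^{-j[\alpha - \gamma N/(1-\gamma)]}\, \langle g, Ш^{ij}_{\mathrm{out}} f \rangle,$$

where the promised dyadic shifts $Ш^{ij}_{\mathrm{out}}$ are explicitly given by

$$Ш^{ij}_{\mathrm{out}} f := \sum_{K \in \mathscr{D}^\beta}\ \sum_{\substack{I \in \mathscr{D}^\beta,\ J \in \mathscr{D}^\beta_{\mathrm{good}};\ I, J \subseteq K \\ \ell(J) = 2^i \ell(I) = 2^{-j-\theta(j)} \ell(K) \\ 2^j < D(I,J)/\ell(J) \le 2^{j+1}}}\ \sum_{\eta, \theta} h^\theta_J\, a^{\eta\theta}_{IJ}\, \langle h^\eta_I, f \rangle =: \sum_{K \in \mathscr{D}^\beta} A_K f.$$

The persistent summation conditions (4.3) and the goodness of $(I, J)$ may be
incorporated by simply defining some of the coefficients $a^{\eta\theta}_{IJ}$ to be zero. From the
estimate $|a^{\eta\theta}_{IJ}| \lesssim |I|^{1/2} |J|^{1/2} / |K|$ and the size and support properties of the Haar
functions, it follows that $A_K$ is an averaging operator,

$$A_K f(x) = \fint_K a_K(x,y) f(y)\,dy, \qquad \|a_K\|_\infty \lesssim 1.$$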

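One needs to check that $Ш^{ij}_{\mathrm{out}}$ is a good shift. If $J \in \mathscr{D}^\beta_{\mathrm{good}}$ appears in $A_K$, it
is immediate from the goodness of $J$ that $\operatorname{dist}(J, \partial K) > \ell(J)^\gamma \ell(K)^{1-\gamma}$. For $I$, one
can argue as follows:

$$\operatorname{dist}(I, \partial K) \ge \operatorname{dist}(J, \partial K) - D(I, J) > \ell(J)^\gamma \ell(K)^{1-\gamma} - 2^{j+1} \ell(J),$$

and $\ell(J) = \ell(J)^\gamma \ell(J)^{1-\gamma} = \ell(J)^\gamma (2^{-j-\theta(j)} \ell(K))^{1-\gamma}$; hence, since
$(1-\gamma)\theta(j) \ge \gamma j + r$,

$$\begin{aligned}
\operatorname{dist}(I, \partial K) &> \ell(J)^\gamma \ell(K)^{1-\gamma} \big( 1 - 2^{j+1}\, 2^{-j(1-\gamma) - (\gamma j + r)} \big) \\
&= \ell(J)^\gamma \ell(K)^{1-\gamma} (1 - 2^{1-r}) \ge \tfrac{1}{2}\, \ell(J)^\gamma \ell(K)^{1-\gamma},
\end{aligned}$$

and $\ell(J) \ge \ell(I)$.

Now each individual $Ш^{ij}_{\mathrm{out}}$ is seen to be of the required form, but the
parameterisation of the series is still different from the one stated in the theorem. Thus,
let

$$v := j + \theta(j) = \frac{j + r}{1 - \gamma} + O(1), \qquad u := i + v,$$

so that $\ell(I) = 2^{-u} \ell(K)$ and $\ell(J) = 2^{-v} \ell(K)$ for all $I$, $J$ appearing in $A_K$. Then

$$2^{-j[\alpha - \gamma N/(1-\gamma)]} \lesssim 2^{-v[\alpha(1-\gamma) - \gamma N]} = 2^{-v\alpha/2}$$

and hence

$$2^{-i\alpha/2}\, 2^{-j[\alpha - \gamma N/(1-\gamma)]} \lesssim 2^{-(i+v)\alpha/2} = 2^{-u\alpha/2} = 2^{-\max(u,v)\alpha/2}.$$

This completes the treatment of $\Sigma_{\mathrm{out}}$.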
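The "simple algebra" with the parameter $\gamma$ used in this subsection can be double-checked mechanically. The following sketch, with hypothetical helper names, verifies in exact rational arithmetic two facts for the choice $\gamma = \alpha / (2(N+\alpha))$: the identity $\alpha(1-\gamma) - \gamma N = \alpha/2$, and the inequality $(1-\gamma)(j + \theta(j)) \ge j + r$ that follows from taking $\theta(j)$ to be the ceiling of $(\gamma j + r)/(1-\gamma)$. The sample values of $\alpha$, $N$, $r$ and $j$ are arbitrary choices for the example.

```python
import math
from fractions import Fraction


def gamma_choice(alpha, N):
    """gamma = alpha / (2 * (N + alpha)), as an exact rational number."""
    alpha = Fraction(alpha)
    return alpha / (2 * (N + alpha))


def theta(j, r, g):
    """theta(j) = ceil((g * j + r) / (1 - g)) for a rational g in (0, 1)."""
    return math.ceil((g * j + r) / (1 - g))


def exponent_identity_holds(alpha, N):
    """Check alpha * (1 - gamma) - gamma * N == alpha / 2."""
    alpha = Fraction(alpha)
    g = gamma_choice(alpha, N)
    return alpha * (1 - g) - g * N == alpha / 2


def ancestor_gap_holds(j, r, g):
    """Check (1 - g) * (j + theta(j)) >= j + r."""
    return (1 - g) * (j + theta(j, r, g)) >= j + r


if __name__ == "__main__":
    g = gamma_choice(Fraction(1, 2), 3)
    print(g, exponent_identity_holds(Fraction(1, 2), 3))
    print(all(ancestor_gap_holds(j, 4, g) for j in range(20)))
```

The first check is the reason why the decay factor in $v$ comes out as $2^{-v\alpha/2}$; the second is what turns $2^{j+1} \cdot 2^{-(1-\gamma)(j+\theta(j))}$ into at most $2^{1-r}$ in the goodness verification.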

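4.B. The term $\Sigma_{\mathrm{in}}$. The first basic observation is that the conditions $\operatorname{dist}(I, J) <
\ell(I) < 2^{-r} \ell(J)$ and the goodness of $I$ imply that in fact $I$ must be fully contained
in (and even deep inside) one of the children $J' \in \operatorname{ch}(J)$ of $J$. On this set, $\mathbb{D}_J g$ takes
a constant value $\langle \mathbb{D}_J g \rangle_{J'} = \langle \mathbb{D}_J g \rangle_I$. Then, for $I$, $J$ appearing in $\Sigma_{\mathrm{in}}$, a paraproduct
can be extracted, as usual,

$$\begin{aligned}
\langle \mathbb{D}_J g, T \mathbb{D}_I f \rangle &= \langle 1_{(J')^c} \mathbb{D}_J g, T \mathbb{D}_I f \rangle + \langle \mathbb{D}_J g \rangle_{J'} \langle 1_{J'}, T \mathbb{D}_I f \rangle \\
&= \langle 1_{(J')^c} (\mathbb{D}_J g - \langle \mathbb{D}_J g \rangle_{J'}), T \mathbb{D}_I f \rangle + \langle \mathbb{D}_J g \rangle_{J'} \langle 1, T \mathbb{D}_I f \rangle \\
&= \sum_{\eta, \theta} \langle g, h^\theta_J \rangle \langle 1_{(J')^c} (h^\theta_J - \langle h^\theta_J \rangle_{J'}), T h^\eta_I \rangle \langle h^\eta_I, f \rangle + \langle \mathbb{D}_J g \rangle_I \langle T^* 1, \mathbb{D}_I f \rangle.
\end{aligned}$$

The coefficients in the first term satisfy (cf. [19, Lemma 7.3] or [9, Lemma 8.3])

$$|\langle 1_{(J')^c} (h^\theta_J - \langle h^\theta_J \rangle_{J'}), T h^\eta_I \rangle| \lesssim \Big( \frac{\ell(I)}{\ell(J)} \Big)^{\alpha/2} \big( \|h^\theta_J\|_\infty + |\langle h^\theta_J \rangle_{J'}| \big) \|h^\eta_I\|_1 \lesssim \Big( \frac{\ell(I)}{\ell(J)} \Big)^{\alpha/2} \Big( \frac{|I|}{|J|} \Big)^{1/2} = 2^{-i\alpha/2} \Big( \frac{|I|}{|J|} \Big)^{1/2}$$

for $\ell(I) = 2^{-i} \ell(J)$. Altogether then,

$$\Sigma_{\mathrm{in}} = \sum_{i=r+1}^{\infty} 2^{-i\alpha/2}\, \langle g, Ш^{i}_{\mathrm{in}} f \rangle + \sum_{I} \langle T^* 1, \mathbb{D}_I f \rangle \sum_{\substack{J \supseteq I \\ \ell(J) > 2^r \ell(I)}} \langle \mathbb{D}_J g \rangle_I\, \pi_{IJ},$$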
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where the new sequence of dyadic shifts is given by 



>: 


>; 


i. 


h$afj{h],f) 




^e®fo< 


^j /G®'^, /CJ 


nfi 








l(I)=2-'l(J) 








>; 


>; 




E E'^X^ 


ife@/3 


J'^Kaad,^ "'CK 


' ieS)i^. ic.J i.e 






^(,/)=2"''f(i<') 


e{iy. 


^2-i-'-^(^) 





{hlf)=: J2 ^Kf. 

The middle equality follows by simply introducing the new summation variable
$K := J^{(r)}$. Again, the implicit summation conditions (4.3) are also in force, but
may be suppressed by defining some of the $a_{IJ}$ as zero. The coefficients satisfy
$|a_{IJ}| \lesssim (|I|/|J|)^{1/2}$, which, in combination with the properties of the Haar functions,
shows that
\[
A_K f(x) = \frac{1}{|K|}\int_K a_K(x,y) f(y)\,dy, \qquad \|a_K\|_\infty \lesssim 1.
\]
It is further clear that $Ш^{i}_{\mathrm{in}}$ is a shift with parameters $(u,v) = (i+r, r)$, and
$2^{-i\alpha/2} \lesssim 2^{-\max(u,v)\alpha/2}$, since $r$ is a fixed number. The goodness conditions for the
shift follow for $J$ directly from the goodness of $J$, and for $I$ from the fact that
$I \subset J$ so that $\operatorname{dist}(I,\partial K) \geq \operatorname{dist}(J,\partial K) > \ell(J)^{\gamma}\ell(K)^{1-\gamma}$.

4.C. The paraproduct. It is time to treat the part of $\Sigma_{\mathrm{in}}$ which was left over
after the extraction of the shifts $Ш^{i}_{\mathrm{in}}$ above. Making the suppressed summation
conditions explicit, it is
\[
\sum_{I} \langle T^*1, \mathbb{D}_I f\rangle \sum_{\substack{J\in\mathscr{D}_{\mathrm{good}},\ J\supset I \\ \ell(J) > 2^r\ell(I) \\ (I,J)\ \mathrm{good}}} \langle\mathbb{D}_J g\rangle_I\,\pi_{IJ},
\]
where the conditions that $J \supset I$ and $(I,J)$ be good may as well be dropped from
the last sum, since otherwise $\langle\mathbb{D}_J g\rangle_I = 0$ or $\pi_{IJ} = 0$. Now I resort to the fact
that it is the expectation $\mathbb{E}_\beta$ of this quantity which ultimately matters, and it is
also important to recall the precise definition of the numbers $\pi_{IJ}$. (A predecessor
of the following computation is found in [Qj Section 9].) Abbreviating temporarily
$\tau_{IJ} := \langle T^*1, \mathbb{D}_I f\rangle\langle\mathbb{D}_J g\rangle_I$, this leads to the expression

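\[
\begin{aligned}
\mathbb{E}_\beta \sum_{I}\ \sum_{\substack{J\in\mathscr{D}_{\mathrm{good}} \\ \ell(J) > 2^r\ell(I)}} \tau_{IJ}\,\pi_{IJ}
 &= \mathbb{E}_\beta \sum_{\ell(J) > 2^r\ell(I)} \tau_{I+\beta,J+\beta}\,\mathbb{E}_\beta\bigl[1_{\mathrm{good}(\beta)}(I+\beta) \,\big|\, \beta_j : 2^{-j} < \ell(J)\bigr]\,1_{\mathrm{good}(\beta)}(J+\beta) \\
 &= \sum_{\ell(J) > 2^r\ell(I)} \mathbb{E}_\beta\Bigl[\tau_{I+\beta,J+\beta}\,\mathbb{E}_\beta\bigl[1_{\mathrm{good}(\beta)}(I+\beta) \,\big|\, \beta_j : 2^{-j} < \ell(J)\bigr]\Bigr]\,\mathbb{E}_\beta\bigl[1_{\mathrm{good}(\beta)}(J+\beta)\bigr] \\
 &= \sum_{\ell(J) > 2^r\ell(I)} \mathbb{E}_\beta\bigl[\tau_{I+\beta,J+\beta}\,1_{\mathrm{good}(\beta)}(I+\beta)\bigr]\,\pi_{\mathrm{good}} \\
 &= \pi_{\mathrm{good}}\,\mathbb{E}_\beta \sum_{I\in\mathscr{D}_{\mathrm{good}}} \langle T^*1, \mathbb{D}_I f\rangle \sum_{\substack{J \supset I \\ \ell(J) > 2^r\ell(I)}} \langle\mathbb{D}_J g\rangle_I,
\end{aligned}
\]
where the natural condition that $J \supset I$ was reimposed to avoid unnecessary zeros
in the sum.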

In the inner sum, $\langle\mathbb{D}_J g\rangle_I = \langle g\rangle_{J'} - \langle g\rangle_J$, where $I \subset J' \in \operatorname{ch}(J)$ and $\ell(J) < 2^n$.
Recalling the abuse of notation when $\ell(J) = 2^n$, when $\mathbb{D}_J$ in fact stands for
$\mathbb{D}_J + \mathbb{E}_J$, there holds $\langle(\mathbb{D}_J + \mathbb{E}_J)g\rangle_I = \langle g\rangle_{J'}$ in this case. Thus the summation over
$J$ (if nonempty) is telescopic, and collapses to $\langle g\rangle_{I^{(r)}}$. For simplicity of notation, let
$\langle g\rangle_{I^{(r)}}$ be abuse of notation for zero in the case of an empty sum, i.e., when $\ell(I^{(r)}) \geq 2^n$.
After collapsing the telescope as explained, the computation continues by essentially
reversing what was done above, but with the collapsed double sum: (A useful
temporary abbreviation now is $\tilde\tau_{IJ} := \langle T^*1, \mathbb{D}_I f\rangle\,\langle g\rangle_J\,1_{\operatorname{ch}^{(r)}(J)}(I)$, where the last
factor is one if and only if $I \subset J$ with $\ell(I) = 2^{-r}\ell(J)$.)



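\[
\begin{aligned}
\pi_{\mathrm{good}}\,\mathbb{E}_\beta \sum_{I\in\mathscr{D}_{\mathrm{good}}} \langle T^*1, \mathbb{D}_I f\rangle\,\langle g\rangle_{I^{(r)}}
 &= \pi_{\mathrm{good}} \sum_{\ell(J) = 2^r\ell(I)} \mathbb{E}_\beta\bigl[\tilde\tau_{I+\beta,J+\beta}\,1_{\mathrm{good}(\beta)}(I+\beta)\bigr] \\
 &= \sum_{\ell(J) = 2^r\ell(I)} \mathbb{E}_\beta\Bigl[\tilde\tau_{I+\beta,J+\beta}\,\mathbb{E}_\beta\bigl[1_{\mathrm{good}(\beta)}(I+\beta) \,\big|\, \beta_j : 2^{-j} < \ell(J)\bigr]\Bigr]\,\mathbb{E}_\beta\bigl[1_{\mathrm{good}(\beta)}(J+\beta)\bigr] \\
 &= \mathbb{E}_\beta \sum_{\ell(J) = 2^r\ell(I)} \tilde\tau_{I+\beta,J+\beta}\,\pi_{I+\beta,J+\beta}\,1_{\mathrm{good}(\beta)}(J+\beta) \\
 &= \mathbb{E}_\beta \sum_{J\in\mathscr{D}_{\mathrm{good}}}\ \sum_{\substack{I\in\mathscr{D},\ I\subset J \\ \ell(I) = 2^{-r}\ell(J)}} \langle g\rangle_J\,\langle T^*1, \mathbb{D}_I f\rangle\,\pi_{IJ}.
\end{aligned}
\]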
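
The telescoping step used before the last computation can be written out explicitly. The following is only a sketch under the assumption, consistent with the truncation described in the text, that the cubes $J$ in the inner sum are the ancestors $I^{(k)}$ of $I$ with $r < k \leq m$, where $\ell(I^{(m)}) = 2^n$; the index $m$ is introduced here purely for illustration.

```latex
\begin{align*}
\sum_{k=r+1}^{m} \langle \mathbb{D}_{I^{(k)}} g\rangle_I
 &= \sum_{k=r+1}^{m-1}\bigl(\langle g\rangle_{I^{(k-1)}} - \langle g\rangle_{I^{(k)}}\bigr)
    + \langle g\rangle_{I^{(m-1)}} \\
 &= \bigl(\langle g\rangle_{I^{(r)}} - \langle g\rangle_{I^{(m-1)}}\bigr)
    + \langle g\rangle_{I^{(m-1)}}
  = \langle g\rangle_{I^{(r)}}.
\end{align*}
```

The last summand ($k = m$) is the one where $\mathbb{D}_J$ stands for $\mathbb{D}_J + \mathbb{E}_J$, which is why it contributes a full average rather than a difference.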



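In order to interpret this as an average of good dyadic shifts, one still needs to
introduce the new summation variable $K := J^{(r)}$, leading to
\[
\begin{aligned}
 &= \mathbb{E}_\beta \sum_{K\in\mathscr{D}}\ \sum_{\substack{J\in\mathscr{D}_{\mathrm{good}},\ J\subset K \\ \ell(J) = 2^{-r}\ell(K)}}\ \sum_{\substack{I\in\mathscr{D},\ I\subset J \\ \ell(I) = 2^{-2r}\ell(K)}} \langle g\rangle_J\,\langle T^*1, \mathbb{D}_I f\rangle\,\pi_{IJ} \\
 &= \mathbb{E}_\beta \Bigl\langle g, \sum_{K} A_K f\Bigr\rangle =: \mathbb{E}_\beta \langle g, \Pi^* f\rangle,
\end{aligned}
\]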
where $\Pi^*$ is a dual paraproduct operator. Note that the kernel of $A_K f(x) =
\frac{1}{|K|}\int_K a_K(x,y) f(y)\,dy$ is
\[
a_K(x,y) = |K| \sum_{J}\sum_{I} \frac{1_J(x)}{|J|}\,\langle T^*1, h_I^{\theta}\rangle\,\pi_{IJ}\,h_I^{\theta}(y),
\]
where the summation conditions are the same as above, and $|\langle T^*1, h_I^{\theta}\rangle| \lesssim |I|^{1/2}$
since $T^*1 \in \mathrm{BMO}$. As $|K|/|J| = 2^{rN}$, it follows that $\|a_K\|_\infty \lesssim 1$, as required. Also,
the goodness of $J$ ensures that $\operatorname{dist}(J,\partial K) > \ell(J)^{\gamma}\ell(K)^{1-\gamma}$, and the same estimate
follows for $I$ simply because $I \subset J$. This completes the verification that $\Pi^*$ is a
good dyadic shift with parameters $(u,v) = (2r, r)$.

4.D. The term $\Sigma_{\mathrm{near}}$. Here the summation conditions are $2^{-r}\ell(J) \leq \ell(I) \leq \ell(J)$
and $\operatorname{dist}(I,J) < \ell(I)$, which implies that in fact $\operatorname{dist}(I,J) = 0$. Splitting the sum
according to the value of $i = 0, 1, \ldots, r$ such that $\ell(I) = 2^{-i}\ell(J)$, the goodness of
$J$ implies that $J \subset K := I^{(r+i)}$, which can be taken as a new summation variable.

r 

where 

$\ell(J) = 2^{-r}\ell(K)$, $\quad \ell(I) = 2^{-i-r}\ell(K)$

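and, simply by the boundedness of $T$ on $L^2(\mathbb{R}^N)$,
\[
|a_{IJ}| = |\langle h_J^{\theta'}, T h_I^{\theta}\rangle\,\pi_{IJ}| \lesssim \|h_J^{\theta'}\|_2\,\|h_I^{\theta}\|_2 = 1.
\]
Using the size of the Haar functions and the fact that both $I$ and $J$ are essentially
of the same size as $K$, it follows that $A_K$ has the right size.

The goodness of $J$ implies that $\operatorname{dist}(J,\partial K) > \ell(J)^{\gamma}\ell(K)^{1-\gamma}$ and, using that
$\operatorname{dist}(I,J) = 0$,
\[
\begin{aligned}
\operatorname{dist}(I,\partial K) &\geq \operatorname{dist}(J,\partial K) - \ell(J) \\
 &> \ell(J)^{\gamma}\ell(K)^{1-\gamma}\bigl(1 - 2^{-r(1-\gamma)}\bigr) \geq \tfrac12\,\ell(J)^{\gamma}\ell(K)^{1-\gamma}.
\end{aligned}
\]
Thus $Ш^{i}_{\mathrm{near}}$ is a good dyadic shift with parameters $(u,v) = (r+i, r)$.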

4.E. Completion of the decomposition. In the part of the martingale
difference representation with $\ell(I) > \ell(J)$, one can perform completely analogous
considerations as above on the dual side, leading to a series of pairings $\langle Ш g, f\rangle$,
where $Ш$ is a good dyadic shift. However, the definition of a good shift is self-dual,
in the sense that $Ш^*$ satisfies all the conditions if and only if $Ш$ does. Hence,
simply writing $\langle Ш g, f\rangle = \langle g, Ш^* f\rangle$ in each summand of the dual series, even this
part attains the required form. As a curiosity, it may be observed that the part with
$\ell(I) > \ell(J)$ gives shifts with parameters $(u,v)$ such that $u < v$, whereas $\ell(I) \leq \ell(J)$
gave $u \geq v$. Indeed, the adjoint of a shift with parameters $(u,v)$ is a shift with
parameters $(v,u)$.

Theorem 4.2 still claims the finiteness and the uniform boundedness of all the
appearing shifts $Ш$. The finiteness is clear from the fact that these shifts are
constructed by reorganising the finite sums from the martingale difference
representation. Concerning the uniform boundedness on (the unweighted!) $L^2$,
this may be easily extracted from Nazarov, Treil and Volberg's proof of the
nonhomogeneous Tb theorem [19], in which this decomposition is implicitly performed.
It is also not difficult to give a direct proof in the present homogeneous situation;
however, somewhat different considerations are required for the cancellative
shifts, which involve the noncancellative Haar functions on at most one level, and
the paraproducts, where noncancellative Haar functions are present on all length
scales. But once this unweighted boundedness is known, the weighted estimates for
the different shifts can be established in a uniform manner, without distinguishing
the paraproducts from the other kinds of shifts. The proof of Theorem 4.2 is
complete.

5. Unweighted end-point estimate for the shifts 

The basic unweighted estimate for the dyadic shifts is the uniform (in the shift
parameters) boundedness on $L^2$, which was made a part of Definition 4.1 above.
The next step is proving appropriate weak-type bounds in $L^1$. This is the same
general strategy as in Lacey-Petermichl-Reguera [13]; the novelty consists of
improving the exponential dependence on the shift parameters to a linear one.

5.1. Proposition. A bounded dyadic shift with parameters $(u,v)$ maps $L^1$ into
$L^{1,\infty}$ with norm $O(u)$.

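Proof. This is a rather classical-style argument based on the Calderon-Zygmund
decomposition. Given $f \in L^1(\mathbb{R}^N)$, let $g$ and $b$ be its good and bad parts with
respect to height $\lambda$ and the dyadic system $\mathscr{D}$ related to the particular shift; i.e.,
$b = f - g = \sum_{L\in\mathscr{B}} b_L$ with $b_L := 1_L(f - \langle f\rangle_L)$, where $L \in \mathscr{B} \subset \mathscr{D}$ are the maximal
dyadic cubes with $\frac{1}{|L|}\int_L |f|\,dx > \lambda$. As usual
\[
\begin{aligned}
|\{|Ш f| > \lambda\}| &\leq |\{|Ш g| > \tfrac12\lambda\}| + |\{|Ш b| > \tfrac12\lambda\}|, \\
|\{|Ш g| > \tfrac12\lambda\}| &\leq 4\lambda^{-2}\|Ш g\|_2^2 \lesssim \lambda^{-2}\|g\|_2^2 \lesssim \lambda^{-1}\|f\|_1,
\end{aligned}
\]
and
\[
Ш b = \sum_{L} Ш b_L = \sum_{L}\sum_{K} A_K b_L.
\]
A necessary condition for $A_K b_L \neq 0$ is $K \cap L \neq \emptyset$, which means that $K \subset L$ or
$K \supsetneq L$. But, if $\ell(K) > 2^u\ell(L)$, then the kernel $a_K(x,y)$ of $A_K$, as a function of $y$,
is constant on all $I \in \mathscr{D}$ with $\ell(I) = \ell(L)$, and in particular on $L$. Since $\int b_L = 0$,
it follows that $A_K b_L = 0$ also in this case. Thus
\[
\sum_{K} A_K b_L = \sum_{K \subset L} A_K b_L + \sum_{i=1}^{u} A_{L^{(i)}} b_L.
\]
The first sum is supported on $L$, and the second contains just $u$ summands. Hence
\[
|\{|Ш b| > \tfrac12\lambda\}| \leq \Bigl|\bigcup_{L\in\mathscr{B}} L\Bigr| + \Bigl|\Bigl\{\Bigl|\sum_{L\in\mathscr{B}}\sum_{i=1}^{u} A_{L^{(i)}} b_L\Bigr| > \tfrac12\lambda\Bigr\}\Bigr|,
\]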
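where the first term is bounded in the standard way by $\sum_{L\in\mathscr{B}} |L| \leq \lambda^{-1}\|f\|_1$.
The second term is estimated as follows:
\[
\begin{aligned}
\Bigl|\Bigl\{\Bigl|\sum_{L\in\mathscr{B}}\sum_{i=1}^{u} A_{L^{(i)}} b_L\Bigr| > \tfrac12\lambda\Bigr\}\Bigr|
 &\leq \frac{2}{\lambda}\Bigl\|\sum_{L\in\mathscr{B}}\sum_{i=1}^{u} A_{L^{(i)}} b_L\Bigr\|_1
  \leq \frac{2}{\lambda}\sum_{L\in\mathscr{B}}\sum_{i=1}^{u} \|A_{L^{(i)}} b_L\|_1 \\
 &\leq \frac{2}{\lambda}\sum_{L\in\mathscr{B}}\sum_{i=1}^{u} \|b_L\|_1
  = \frac{2u}{\lambda}\sum_{L\in\mathscr{B}} \|b_L\|_1 \leq \frac{4u}{\lambda}\|f\|_1,
\end{aligned}
\]
where the uniform $L^1$-boundedness of the averaging operators $A_K$ was used in the
third-to-last step. □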
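
For reference, the textbook properties of the dyadic Calderon-Zygmund decomposition behind the individual steps of this proof can be listed as follows. This is a sketch in the notation of the proof; the symbol $\mathscr{B}$ for the family of maximal cubes is an assumption about the garbled original, $N$ is the dimension, and $g = f 1_{(\bigcup L)^c} + \sum_{L} \langle f\rangle_L 1_L$ is the usual good part.

```latex
\begin{align*}
&\lambda < \frac{1}{|L|}\int_L |f|\,dx \leq 2^N\lambda \quad (L\in\mathscr{B}),
 \qquad \sum_{L\in\mathscr{B}} |L| \leq \lambda^{-1}\|f\|_1, \\
&\|g\|_\infty \leq 2^N\lambda, \qquad \|g\|_1 \leq \|f\|_1, \qquad
 \|g\|_2^2 \leq \|g\|_\infty\|g\|_1 \leq 2^N\lambda\,\|f\|_1, \\
&\int b_L\,dx = 0, \qquad \|b_L\|_1 \leq 2\int_L |f|\,dx, \qquad
 \sum_{L\in\mathscr{B}}\|b_L\|_1 \leq 2\|f\|_1.
\end{align*}
```

The second line is what turns the $L^2$ bound for $Шg$ into a bound of order $\lambda^{-1}\|f\|_1$, and the third line is what is used for the bad part.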

6. The weighted testing conditions in terms of shifts 

It was explained in the Introduction that the Perez-Treil-Volberg result [21]
reduced the proof of the $A_2$ conjecture to the verification of the testing condition
\[
\|T(w1_Q)\|_{L^2(\sigma)} \lesssim \|w\|_{A_2}\,w(Q)^{1/2}
\]
for all cubes $Q \subset \mathbb{R}^N$. The left side is the supremum over all normalised, compactly
supported (thanks to density) $f \in L^2(w)$ of
\[
\langle f, T(w1_Q)\rangle = \lim_{n\to\infty} \mathbb{E}_\beta \sum_{u,v=r}^{\infty} 2^{-\max(u,v)\alpha/2}\,\langle f, Ш^{uv}_{n\beta}(w1_Q)\rangle.
\]
Therefore, it suffices to prove the corresponding testing estimate
\[
\|Ш^{uv}_{n\beta}(w1_Q)\|_{L^2(\sigma)} \leq \phi(u,v)\,\|w\|_{A_2}\,w(Q)^{1/2},
\]
with some $\phi(u,v)$ such that the series $\sum_{u,v=r}^{\infty} 2^{-\max(u,v)\alpha/2}\phi(u,v)$ is summable.
Note that the cube $Q$ in this testing condition is completely arbitrary; it does not
in general belong to the (also arbitrary) dyadic systems appearing in the definition
of the dyadic shift.

The rest of the paper is dedicated to proving the following estimate, from which
the required summability follows (thanks to $\alpha/2 - \gamma N/2 > \alpha/4 > 0$), thereby
verifying the $A_2$ conjecture.

6.1. Theorem. Let $Ш = \sum_{K\in\mathscr{D}} A_K$ be a good, finite, bounded dyadic shift with
parameters $(u,v)$. Then
\[
\|Ш(w1_Q)\|_{L^2(\sigma)} \lesssim 2^{\max(u,v)\gamma N/2}\,uv\,\|w\|_{A_2}\,w(Q)^{1/2}
\]
for all cubes $Q \subset \mathbb{R}^N$. (The exponential factor is unnecessary if $Q \in \mathscr{D}$.)
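
To see why a bound of this shape gives a summable series, one can count the pairs $(u,v)$ by the value of their maximum. The following is only a sketch: it assumes that the bound of Theorem 6.1 has the form $2^{\max(u,v)\gamma N/2}\,uv\,\|w\|_{A_2}w(Q)^{1/2}$ and that the summation runs over integers $u, v \geq r \geq 1$; the abbreviations $\delta := \alpha/2 - \gamma N/2$ and $m := \max(u,v)$ are introduced here for illustration, with $\delta > \alpha/4$ as stated in the text.

```latex
\begin{align*}
\sum_{u,v\geq r} 2^{-\max(u,v)\alpha/2}\cdot 2^{\max(u,v)\gamma N/2}\,uv
 &= \sum_{u,v \geq r} 2^{-\max(u,v)\delta}\,uv \\
 &\leq \sum_{m=r}^{\infty} \#\{(u,v): \max(u,v)=m\}\; m^2\, 2^{-m\delta} \\
 &\leq \sum_{m=r}^{\infty} (2m+1)\,m^2\,2^{-m\alpha/4} < \infty.
\end{align*}
```

A polynomial factor in the shift parameters is thus harmless, whereas an exponential factor is admissible only if its rate stays below the decay rate $\alpha/2$ of the representation.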
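As before, $A_K(w1_Q)$ can only be nonzero if $K \cap Q \neq \emptyset$, and therefore
\[
Ш(w1_Q) = \sum_{K:\,K\cap Q\neq\emptyset} A_K(w1_Q)
 = \sum_{\substack{K:\,K\cap Q\neq\emptyset \\ \ell(K) \geq \ell(Q)}} A_K(w1_Q) + \sum_{\substack{K:\,K\cap Q\neq\emptyset \\ \ell(K) < \ell(Q)}} A_K(w1_Q).
\]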




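The large scales are by far the easier part of the estimate, and in fact uniform with
respect to the shift parameters:
\[
\Bigl|\sum_{k=0}^{\infty}\ \sum_{\substack{K:\,K\cap Q\neq\emptyset \\ \ell(K) = 2^k\ell(Q)}} A_K(w1_Q)\Bigr|
 \leq \sum_{k=0}^{\infty}\ \sum_{\substack{K:\,K\cap Q\neq\emptyset \\ \ell(K) = 2^k\ell(Q)}} \frac{w(Q)}{|K|}\,1_K
 \lesssim \frac{w(Q)}{|Q|}\,1_{3Q} + M(w1_Q)\,1_{(3Q)^c}.
\]
For the first term on the right,
\[
\Bigl\|\frac{w(Q)}{|Q|}\,1_{3Q}\Bigr\|_{L^2(\sigma)} = \frac{w(Q)}{|Q|}\,\sigma(3Q)^{1/2}
 \lesssim w(Q)^{1/2}\Bigl(\frac{w(3Q)}{|3Q|}\cdot\frac{\sigma(3Q)}{|3Q|}\Bigr)^{1/2}
 \leq \|w\|_{A_2}^{1/2}\,w(Q)^{1/2}.
\]
And for the second, as a direct application of Buckley's estimate [3, Theorem 2.5]
\[
\|Mf\|_{L^2(w)} \lesssim \|w\|_{A_2}\,\|f\|_{L^2(w)}, \tag{6.2}
\]
it follows that
\[
\|M(w1_Q)\,1_{(3Q)^c}\|_{L^2(\sigma)} \leq \|M(w1_Q)\|_{L^2(\sigma)} \lesssim \|\sigma\|_{A_2}\,\|w1_Q\|_{L^2(\sigma)} = \|w\|_{A_2}\,w(Q)^{1/2}.
\]
The main part of the argument consists of handling the small scales.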

7. The main estimates 

This section contains the core inequalities behind the $A_2$ conjecture. They follow
quite closely the innovative estimates originally due to Lacey, Petermichl and
Reguera [13], which gave the analogue of the $A_2$ conjecture for individual dyadic
shifts. However, in order to obtain bounds with admissible dependence on the
shift parameters, a number of modifications are needed here and there, so it seems
appropriate to present the argument in full detail. It is also worth recalling the
additional difficulty here that the cube $Q$ need not be dyadic; this is to some extent
compensated by goodness of the shift under consideration, as will be apparent in
the very last Lemma 7.7 below.

With the dyadic shift of interest, $Ш = \sum_{K\in\mathscr{D}} A_K$, fixed for the moment, let
\[
Ш_{\mathscr{K}} := \sum_{K\in\mathscr{K}} A_K
\]
whenever $\mathscr{K} \subset \mathscr{D}$ is a subset. With this notation, the goal is to estimate
\[
Ш_{\{K\in\mathscr{D}:\ K\cap Q\neq\emptyset,\ \ell(K) < \ell(Q)\}}(w1_Q).
\]
In fact, since $Ш$ is good, which means that the kernel of each $A_K$ is supported only
on the subset
\[
\tilde K := \{x \in K : \operatorname{dist}(x,\partial K) > 2^{-\max(u,v)\gamma}\ell(K)\},
\]
the condition that $A_K(w1_Q) \neq 0$ implies that even $\tilde K \cap Q \neq \emptyset$. Letting
\[
\mathscr{K} := \{K \in \mathscr{D} : \tilde K \cap Q \neq \emptyset,\ \ell(K) < \ell(Q)\},
\]
the task is reduced to proving that
\[
\|Ш_{\mathscr{K}}(w1_Q)\|_{L^2(\sigma)} \lesssim 2^{\max(u,v)\gamma N/2}\,uv\,\|w\|_{A_2}\,w(Q)^{1/2}. \tag{7.1}
\]
7.A. Pigeonholing a la Lacey et al. The bound (7.1) will be accomplished by
carefully partitioning the collection $\mathscr{K}$ into appropriate subsets, where the weights
$w$ and $\sigma$ are well under control, a procedure introduced by Lacey, Petermichl and
Reguera [13]. This consists of several steps:

(1) The collection $\mathscr{K}$ is partitioned into $v + 1$ subcollections simply according
to the value of $\log_2 \ell(K) \bmod (v + 1)$. This is the step which introduces the
factor $v$ into the estimate. Henceforth, an arbitrary but fixed subcollection
like this will be considered, and with slight abuse still denoted by $\mathscr{K}$.
Note that $A_K(w1_Q)$, which is a linear combination of Haar functions on
cubes $J \in \mathscr{D}$ with $\ell(J) = 2^{-v}\ell(K)$, is constant on dyadic cubes of length
$2^{-v-1}\ell(K)$, and hence on all cubes $K' \in \mathscr{K}$ with $\ell(K') < \ell(K)$.

(2) The local $A_2$ characteristic is essentially fixed by considering the subsets
$\mathscr{K}^a$ of those $K \in \mathscr{K}$ with
\[
2^a < \frac{w(K\cap Q)}{|K|}\cdot\frac{\sigma(K)}{|K|} \leq 2^{a+1},
\]
where $a \in \mathbb{Z}$ with $a \leq \log_2\|w\|_{A_2}$.

(3) Among each $\mathscr{K}^a$, a subset of stopping cubes $\mathscr{S}^a = \bigcup_{k=0}^{\infty}\mathscr{S}^a_k$ is constructed
as follows: $\mathscr{S}^a_0$ consists of all maximal (with respect to set inclusion) $K \in
\mathscr{K}^a$, and then inductively $\mathscr{S}^a_{k+1}$ consists of all maximal $K \in \mathscr{K}^a$ such that
\[
\frac{w(K\cap Q)}{|K|} > 4\,\frac{w(S\cap Q)}{|S|}
\]
for some $S \in \mathscr{S}^a_k$ with $S \supset K$. For $K \in \mathscr{K}^a$, let $K^s$ stand for the minimal
stopping cube $S \in \mathscr{S}^a$ with $S \supseteq K$. Then the collections
\[
\mathscr{K}^a(S) := \{K \in \mathscr{K}^a : K^s = S\}, \qquad S \in \mathscr{S}^a,
\]
form a partition of $\mathscr{K}^a$. (Constructions of this type are known in the
literature under different names, including "principal cubes" and "corona
decompositions.")

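(4) Finally, yet another measure ratio is essentially fixed by considering the
subcollections $\mathscr{K}^a_b(S)$ of those $K \in \mathscr{K}^a(S)$ with
\[
2^{1-b}\,\frac{w(S\cap Q)}{|S|} < \frac{w(K\cap Q)}{|K|} \leq 2^{2-b}\,\frac{w(S\cap Q)}{|S|}, \qquad b \in \mathbb{N}.
\]
Note that for $K \in \mathscr{K}^a_b(S)$, there holds
\[
\frac{\sigma(K)}{|K|} \eqsim 2^a\,\frac{|K|}{w(K\cap Q)} \eqsim 2^{a+b}\,\frac{|S|}{w(S\cap Q)} =: \tau^a_b(S),
\]
so the $\sigma$ and Lebesgue measures are essentially comparable, with their ratio
depending only on $a$, $b$ and $S$.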

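The proof of (7.1) then starts by writing
\[
\begin{aligned}
\|Ш_{\mathscr{K}}(w1_Q)\|_{L^2(\sigma)}
 &\leq \sum_{a:\,2^a \leq \|w\|_{A_2}} \Bigl\|\sum_{S\in\mathscr{S}^a} Ш_{\mathscr{K}^a(S)}(w1_Q)\Bigr\|_{L^2(\sigma)}
  = \sum_{a:\,2^a \leq \|w\|_{A_2}} \Bigl(\int \Bigl|\sum_{S\in\mathscr{S}^a} Ш_{\mathscr{K}^a(S)}(w1_Q)\Bigr|^2\sigma\Bigr)^{1/2} \\
 &\leq \sum_{a:\,2^a \leq \|w\|_{A_2}} \Bigl(\sum_{S\in\mathscr{S}^a}\int |Ш_{\mathscr{K}^a(S)}(w1_Q)|^2\sigma
   + 2\sum_{S\in\mathscr{S}^a}\ \sum_{\substack{S'\in\mathscr{S}^a \\ S'\subsetneq S}} \int |Ш_{\mathscr{K}^a(S)}(w1_Q)|\cdot|Ш_{\mathscr{K}^a(S')}(w1_Q)|\,\sigma\Bigr)^{1/2}.
\end{aligned}
\]
It is further observed that all $K \in \mathscr{K}^a(S)$ are either disjoint from or strictly
containing any $S' \in \mathscr{S}^a$ with $S' \subsetneq S$; hence all these $A_K(w1_Q)$, and thus $Ш_{\mathscr{K}^a(S)}(w1_Q)$
itself, are constant on $S'$. Thus
\[
\int |Ш_{\mathscr{K}^a(S)}(w1_Q)|\cdot|Ш_{\mathscr{K}^a(S')}(w1_Q)|\,\sigma
 = \bigl|\langle Ш_{\mathscr{K}^a(S)}(w1_Q)\rangle_{S'}\bigr| \int |Ш_{\mathscr{K}^a(S')}(w1_Q)|\,\sigma.
\]
The next task is obtaining useful bounds for the integral on the right.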

7.B. John-Nirenberg-type estimates. The goal is to estimate the size of the
set where
\[
|Ш_{\mathscr{K}^a(S)}(w1_Q)| > t,
\]
both with respect to the Lebesgue and $\sigma$ measures. The available information is the
weak-type $L^1$ bound for the dyadic shifts, and the Lebesgue measure estimate could
be deduced directly from this by a usual John-Nirenberg-type argument. However,
in order to smoothen the passage to the $\sigma$ measure estimate, it is useful to first
consider the shifts restricted to the collections $\mathscr{K}^a_b(S)$, where the two measures are
comparable.



7.2. Lemma. For a good, finite, bounded dyadic shift $Ш$ with parameters $(u,v)$,
the following estimates hold when $\nu$ is either the Lebesgue or the $\sigma$ measure:
\[
\nu\Bigl(\Bigl\{|Ш_{\mathscr{K}^a_b(S)}(w1_Q)| > u\,2^{-b}\,\frac{w(S\cap Q)}{|S|}\cdot t\Bigr\}\Bigr) \lesssim e^{-ct}\,\nu(S), \qquad t > 0,
\]
where $c > 0$ is a constant.

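Proof. Let $\lambda := Cu\,2^{-b}\,w(S\cap Q)/|S|$, where $C$ is a large constant, and $n \in \mathbb{Z}_+$. Let
$x \in \mathbb{R}^N$ be a point where
\[
|Ш_{\mathscr{K}^a_b(S)}(w1_Q)(x)| > n\lambda. \tag{7.3}
\]
Then for all small enough $L \in \mathscr{K}^a_b(S)$ with $L \ni x$, there holds
\[
\Bigl|\sum_{\substack{K\in\mathscr{K}^a_b(S) \\ K \supseteq L}} A_K(w1_Q)(x)\Bigr| > n\lambda.
\]
Since $\sum_{K \supsetneq L} A_K(w1_Q)$ is constant on $L$, and
\[
\|A_L(w1_Q)\|_\infty \leq \frac{w(L\cap Q)}{|L|} \lesssim 2^{-b}\,\frac{w(S\cap Q)}{|S|} \leq \tfrac13\lambda, \tag{7.4}
\]
it follows that
\[
\Bigl|\sum_{\substack{K\in\mathscr{K}^a_b(S) \\ K \supsetneq L}} A_K(w1_Q)\Bigr| > (n - \tfrac23)\lambda \quad \text{on } L. \tag{7.5}
\]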
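Let $\mathscr{L} \subset \mathscr{K}^a_b(S)$ be the collection of maximal cubes with the above property. Thus
all $L \in \mathscr{L}$ are disjoint, and all $x$ with (7.3) belong to some $L$. By maximality of $L$,
the minimal $L^* \in \mathscr{K}^a_b(S)$ with $L^* \supsetneq L$ satisfies
\[
\Bigl|\sum_{K \supsetneq L^*} A_K(w1_Q)\Bigr| \leq (n - \tfrac23)\lambda \quad \text{on } L^*.
\]
By an estimate similar to (7.4), with $L^*$ in place of $L$, it follows that
\[
\Bigl|\sum_{K \supsetneq L} A_K(w1_Q)\Bigr| \leq (n - \tfrac13)\lambda \quad \text{on } L.
\]
Thus, if $x$ satisfies (7.3) and $x \in L \in \mathscr{L}$, then necessarily
\[
\bigl|Ш_{\{K\in\mathscr{K}^a_b(S):\ K\subseteq L\}}(w1_{Q\cap L})(x)\bigr| = \Bigl|\sum_{\substack{K\in\mathscr{K}^a_b(S) \\ K\subseteq L}} A_K(w1_Q)(x)\Bigr| > \tfrac13\lambda.
\]
Using the weak-type $L^1$ estimate, which is uniform over all bounded dyadic shifts
with parameters $(u,v)$, it follows that
\[
\Bigl|\Bigl\{\Bigl|\sum_{\substack{K\in\mathscr{K}^a_b(S) \\ K\subseteq L}} A_K(w1_Q)\Bigr| > \tfrac13\lambda\Bigr\}\Bigr| \lesssim \frac{u}{\lambda}\,w(L\cap Q) \leq \tfrac13|L|
\]
provided that the constant in the definition of $\lambda$ was chosen large enough. Recalling
(7.5), there holds
\[
\Bigl|\sum_{K\in\mathscr{K}^a_b(S)} A_K(w1_Q)\Bigr| \geq \Bigl|\sum_{\substack{K\in\mathscr{K}^a_b(S) \\ K\supsetneq L}} A_K(w1_Q)\Bigr| - \Bigl|\sum_{\substack{K\in\mathscr{K}^a_b(S) \\ K\subseteq L}} A_K(w1_Q)\Bigr|
 > (n - \tfrac23)\lambda - \tfrac13\lambda = (n-1)\lambda
\]
on a subset $\tilde L \subset L$ with $|\tilde L| \geq \tfrac23|L|$. Thus
\[
\begin{aligned}
|\{|Ш_{\mathscr{K}^a_b(S)}(w1_Q)| > n\lambda\}|
 &\leq \sum_{L\in\mathscr{L}} |L \cap \{|Ш_{\mathscr{K}^a_b(S)}(w1_Q)| > n\lambda\}| \\
 &\leq \sum_{L\in\mathscr{L}} \bigl|\{|Ш_{\{K\in\mathscr{K}^a_b(S):\ K\subseteq L\}}(w1_Q)| > \tfrac13\lambda\}\bigr| \\
 &\leq \sum_{L\in\mathscr{L}} \tfrac13|L| \leq \tfrac12\sum_{L\in\mathscr{L}} |\tilde L| \\
 &\leq \tfrac12\sum_{L\in\mathscr{L}} |L \cap \{|Ш_{\mathscr{K}^a_b(S)}(w1_Q)| > (n-1)\lambda\}| \\
 &\leq \tfrac12\,|\{|Ш_{\mathscr{K}^a_b(S)}(w1_Q)| > (n-1)\lambda\}|.
\end{aligned}
\]
By induction it follows that
\[
|\{|Ш_{\mathscr{K}^a_b(S)}(w1_Q)| > n\lambda\}| \leq 2^{-n}\,|\{|Ш_{\mathscr{K}^a_b(S)}(w1_Q)| > 0\}|
 \leq 2^{-n}\sum_{M\in\mathscr{M}} |M| \leq 2^{-n}|S|,
\]
where $\mathscr{M}$ is the collection of maximal cubes in $\mathscr{K}^a_b(S)$.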
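To deduce the corresponding estimate for the $\sigma$ measure, selected intermediate
steps of the above computation, as well as the definition of $\mathscr{K}^a_b(S)$, will be exploited:
\[
\begin{aligned}
\sigma(\{|Ш_{\mathscr{K}^a_b(S)}(w1_Q)| > n\lambda\}) &\leq \sum_{L\in\mathscr{L}} \sigma(L) \lesssim \sum_{L\in\mathscr{L}} \tau^a_b(S)\,|L| \\
 &\lesssim \tau^a_b(S)\,|\{|Ш_{\mathscr{K}^a_b(S)}(w1_Q)| > (n-1)\lambda\}| \\
 &\lesssim \tau^a_b(S)\,2^{-n}\sum_{M\in\mathscr{M}} |M| \\
 &\lesssim 2^{-n}\sum_{M\in\mathscr{M}} \sigma(M) \leq 2^{-n}\sigma(S). \qquad □
\end{aligned}
\]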

It is an immediate consequence that a similar estimate holds for the bigger
collections $\mathscr{K}^a(S) = \bigcup_{b=0}^{\infty} \mathscr{K}^a_b(S)$; indeed



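where the computation is valid at least for $t \geq 2$, and the conclusion is trivial
otherwise. The final conclusion, for both measures, is that
\[
\int |Ш_{\mathscr{K}^a(S)}(w1_Q)|^p\,d\nu \lesssim \Bigl(u\,\frac{w(S\cap Q)}{|S|}\Bigr)^p\,\nu(S), \qquad p \in [1,\infty). \tag{7.6}
\]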
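
The passage from an exponential tail bound to a moment bound such as (7.6) is the usual layer-cake computation. As a sketch, write $F$ for the function in question and $\lambda_0 := u\,w(S\cap Q)/|S|$, and assume a tail bound of the form $\nu(\{|F| > \lambda_0 t\}) \leq C e^{-ct}\nu(S)$ for all $t > 0$; the names $F$, $\lambda_0$ and $C$ are introduced here only for illustration.

```latex
\begin{align*}
\int |F|^p\,d\nu
 &= p\int_0^\infty s^{p-1}\,\nu(\{|F|>s\})\,ds
  = p\,\lambda_0^p\int_0^\infty t^{p-1}\,\nu(\{|F|>\lambda_0 t\})\,dt \\
 &\leq C\,p\,\lambda_0^p\,\nu(S)\int_0^\infty t^{p-1}e^{-ct}\,dt
  = C\,p\,\Gamma(p)\,c^{-p}\,\lambda_0^p\,\nu(S).
\end{align*}
```

The implied constant therefore depends on $p$ but not on the shift parameters, which enter only through $\lambda_0$.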

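7.C. Conclusion of the proof. Returning to the estimation of $\|Ш_{\mathscr{K}}(w1_Q)\|_{L^2(\sigma)}$,
it has so far been shown that
\[
\|Ш_{\mathscr{K}}(w1_Q)\|_{L^2(\sigma)} \leq \sum_{a:\,2^a\leq\|w\|_{A_2}} \Bigl(\sum_{S\in\mathscr{S}^a}\int |Ш_{\mathscr{K}^a(S)}(w1_Q)|^2\sigma
 + 2\sum_{S\in\mathscr{S}^a}\ \sum_{\substack{S'\in\mathscr{S}^a \\ S'\subsetneq S}} \bigl|\langle Ш_{\mathscr{K}^a(S)}(w1_Q)\rangle_{S'}\bigr| \int |Ш_{\mathscr{K}^a(S')}(w1_Q)|\,\sigma\Bigr)^{1/2}.
\]
Substituting the estimate (7.6) with $\nu = \sigma$ and $p = 1, 2$, this continues with
\[
\lesssim \sum_{a} \Bigl(\sum_{S\in\mathscr{S}^a} \Bigl(u\,\frac{w(S\cap Q)}{|S|}\Bigr)^2\sigma(S)
 + \sum_{S\in\mathscr{S}^a}\ \sum_{\substack{S'\in\mathscr{S}^a \\ S'\subsetneq S}} \bigl|\langle Ш_{\mathscr{K}^a(S)}(w1_Q)\rangle_{S'}\bigr|\,u\,\frac{w(S'\cap Q)}{|S'|}\,\sigma(S')\Bigr)^{1/2},
\]
and recalling the freezing of the local $A_2$ characteristic in the definition of $\mathscr{K}^a$,
\[
\lesssim \sum_{a} \Bigl(2^a \sum_{S\in\mathscr{S}^a} \Bigl[u^2\,w(S\cap Q) + u \sum_{\substack{S'\in\mathscr{S}^a \\ S'\subsetneq S}} \bigl|\langle Ш_{\mathscr{K}^a(S)}(w1_Q)\rangle_{S'}\bigr|\,|S'|\Bigr]\Bigr)^{1/2}.
\]
Concentrating for the moment on the last term,
\[
\begin{aligned}
\sum_{\substack{S'\in\mathscr{S}^a \\ S'\subsetneq S}} \bigl|\langle Ш_{\mathscr{K}^a(S)}(w1_Q)\rangle_{S'}\bigr|\,|S'|
 &\leq \sum_{\substack{S'\in\mathscr{S}^a \\ S'\subsetneq S}} \int_{S'} |Ш_{\mathscr{K}^a(S)}(w1_Q)|\,dx \\
 &\leq \Bigl\|\sum_{\substack{S'\in\mathscr{S}^a \\ S'\subsetneq S}} 1_{S'}\Bigr\|_{L^2}\,\|Ш_{\mathscr{K}^a(S)}(w1_Q)\|_{L^2}.
\end{aligned}
\]
The first factor is bounded by a constant times $|S|^{1/2}$, as one easily checks from the construction of
the stopping cubes: those $S' \subsetneq S$ of the first generation are disjoint, and
\[
\sum |S'| \leq \sum w(S'\cap Q)\,\frac{|S|}{4\,w(S\cap Q)} \leq w(S\cap Q)\,\frac{|S|}{4\,w(S\cap Q)} = \tfrac14|S|;
\]
one simply repeats this for the consecutive generations and sums up a geometric
series. The second factor may be estimated by (7.6) with the Lebesgue measure
and $p = 2$, to the result that
\[
\|Ш_{\mathscr{K}^a(S)}(w1_Q)\|_{L^2} \lesssim u\,\frac{w(S\cap Q)}{|S|}\,|S|^{1/2}.
\]
Thus, altogether
\[
\sum_{\substack{S'\in\mathscr{S}^a \\ S'\subsetneq S}} \bigl|\langle Ш_{\mathscr{K}^a(S)}(w1_Q)\rangle_{S'}\bigr|\,|S'| \lesssim u\cdot w(S\cap Q),
\]
and then
\[
\|Ш_{\mathscr{K}}(w1_Q)\|_{L^2(\sigma)} \lesssim u \sum_{a:\,2^a\leq\|w\|_{A_2}} 2^{a/2}\Bigl(\sum_{S\in\mathscr{S}^a} w(S\cap Q)\Bigr)^{1/2}.
\]
The proof is completed by the following lemma, for then
\[
\|Ш_{\mathscr{K}}(w1_Q)\|_{L^2(\sigma)} \lesssim u \sum_{a:\,2^a\leq\|w\|_{A_2}} 2^{a/2}\,2^{\max(u,v)\gamma N/2}\,\|w\|_{A_2}^{1/2}\,w(Q)^{1/2}
 \lesssim u\,2^{\max(u,v)\gamma N/2}\,\|w\|_{A_2}\,w(Q)^{1/2};
\]
recall that the final estimate will also involve the factor $v$ resulting from summing
up the $v + 1$ subcollections in the first step of the pigeonholing.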
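
The summation over generations of stopping cubes, used above to bound the first factor, can be written out as follows. This is a sketch assuming the stopping factor 4 from step (3) of the pigeonholing; $\mathscr{S}_k(S)$ is an ad hoc notation, not from the original, for the stopping cubes $S' \subseteq S$ that lie $k$ generations below $S$.

```latex
\begin{align*}
\sum_{S'\in\mathscr{S}_k(S)} |S'| &\leq 4^{-k}|S|, \qquad k = 0,1,2,\ldots, \\
\Bigl\| \sum_{k\geq 0}\sum_{S'\in\mathscr{S}_k(S)} 1_{S'} \Bigr\|_{L^2}
 &\leq \sum_{k\geq 0}\Bigl\|\sum_{S'\in\mathscr{S}_k(S)} 1_{S'}\Bigr\|_{L^2}
  = \sum_{k \geq 0}\Bigl(\sum_{S'\in\mathscr{S}_k(S)}|S'|\Bigr)^{1/2} \\
 &\leq \sum_{k\geq 0} 2^{-k}|S|^{1/2} = 2\,|S|^{1/2}.
\end{align*}
```

The middle equality uses that the cubes of a single generation are pairwise disjoint, so the square of their summed indicators equals the sum itself.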

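7.7. Lemma.
\[
\sum_{S\in\mathscr{S}^a} w(S\cap Q) \lesssim 2^{\max(u,v)\gamma N}\,\|w\|_{A_2}\,w(Q).
\]
Proof. Recall the notation $\tilde K$ from the beginning of this section, right before (7.1).
Every $K \in \mathscr{K}$ satisfies $\tilde K \cap Q \neq \emptyset$ and $\ell(K) < \ell(Q)$, which imply that $K \cap Q$ must
contain a cube of sidelength $2^{-\max(u,v)\gamma}\ell(K)$, thus of volume $2^{-\max(u,v)\gamma N}|K|$.
This holds in particular for every $S \in \mathscr{S}^a \subset \mathscr{K}$. Hence
\[
\sum_{S\in\mathscr{S}^a} w(S\cap Q) = \sum_{S\in\mathscr{S}^a} \frac{w(S\cap Q)}{|S|}\,|S|
 \leq 2^{\max(u,v)\gamma N} \sum_{S\in\mathscr{S}^a} \frac{w(S\cap Q)}{|S|}\,|S\cap Q|
 = 2^{\max(u,v)\gamma N} \int_Q \sum_{S\in\mathscr{S}^a} \frac{w(S\cap Q)}{|S|}\,1_S(x)\,dx.
\]
For a fixed point $x$, the construction of the stopping cubes ensures that the ratio
$w(S\cap Q)/|S|$ along $S \ni x$ increases at least geometrically, and hence their sum is
dominated by the maximal value, which in turn is dominated by $M(w1_Q)(x)$. Thus
\[
\begin{aligned}
\int_Q \sum_{S\in\mathscr{S}^a} \frac{w(S\cap Q)}{|S|}\,1_S(x)\,dx
 &\lesssim \int_Q M(w1_Q)\,dx \leq \|M(w1_Q)\|_{L^2(\sigma)}\,\|1_Q\|_{L^2(w)} \\
 &\lesssim \|\sigma\|_{A_2}\,\|w1_Q\|_{L^2(\sigma)}\,w(Q)^{1/2} = \|w\|_{A_2}\,w(Q)
\end{aligned}
\]
by an application of Buckley's estimate (6.2). □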

Note that if $Q \in \mathscr{D}$, then all $K \in \mathscr{K}$ satisfy $K \subset Q$, hence $K \cap Q = K$, and
the introduction of the exponential factor, as well as the use of the goodness of the
shift at this point, is unnecessary.

8. Discussion 

8.A. A shorter proof of the A2 conjecture? At present, a self-contained
proof of the A2 conjecture would consist of the almost 40 pages of Perez, Treil
and Volberg's reduction to the weak-type estimate [21], combined with the present
argument to provide this last missing information. It is perhaps interesting that
both these steps go through a T(1) theorem for a Calderon-Zygmund operator,
and using a Haar wavelet basis; however, one adapted to the measures $w$ and $\sigma$ in
Perez-Treil-Volberg's part [21], and the standard one in the present contribution.

While it gives the desired result, this combination might be a bit of overshooting: 
since the present argument already reduces things to the dyadic shift operators, it 
should philosophically be enough to use a weight-adapted T(l)-theorem for these 
shifts, rather than for general Calderon-Zygmund operators. And for dyadic op- 
erators, it should ideally be enough to verify the weighted testing condition for 
dyadic cubes only, which would somewhat simplify the preceding analysis. Indeed, 
a result of this flavour is provided by Nazarov-Treil- Volberg's two-weight inequal- 
ity for dyadic shift operators [2^ (which lies behind Lacey-Petermichl-Reguera's 
result [13]). But in order to apply it to the desired conclusion, one would need to 
keep track of the dependence of their estimate on the shift parameters, to ensure 
the required summability in the end, whereas the Perez-Treil-Volberg result [21]
may be directly applied as a black box.

It would also be interesting if the Cruz-Uribe-Martell-Perez approach [5] to the
Lacey-Petermichl-Reguera estimate [13], which is based on Lerner's formula, could
be improved so as to have summable dependence on the shift parameters.
8.B. Possible extensions. The representation of a Calderon-Zygmund operator
as an average of good dyadic shifts is an identity, which has no specific connection
to A2 weights, and may be useful for proving other bounds as well. In particular, it
is likely that the same proof strategy is also applicable to providing sharp weighted
weak-type $L^2$ bounds for general Calderon-Zygmund operators, in a similar way
as the Lacey-Petermichl-Reguera argument was extended to weak-type $L^2$
bounds for dyadic shifts [11] and smooth Calderon-Zygmund operators [10] by
Lacey et al. This would involve verifying the weak-type testing condition of Lacey-
Sawyer-Uriarte-Tuero [H], which is very similar to the Perez-Treil-Volberg testing
condition [21] checked in this paper. The main difference is that the Lacey-Sawyer-
Uriarte-Tuero condition requires the estimation of the maximal truncations of T,
rather than just the operator itself; on the other hand, the conclusions of their
theorem are then valid for the maximal truncations as well.

Perez, Treil and Volberg assert that their result extends to Calderon-Zygmund
operators on spaces of homogeneous type [21, Section 12]. It is likely that the
present argument will do so as well. In particular, the dyadic cubes in this generality
have already been constructed by Christ [4], and the required randomisation of this
construction was recently carried out by Martikainen and the author [12]. The
present arguments also made use of some specific symmetries of the Euclidean
space, especially the fact that the probability of a cube being good is constant.
A trick to ensure this even in a metric space has been presented by Martikainen
[17]. One would still need to check whether the computation of the conditional
probabilities, which here employed the explicit form of the randomisation in terms
of the binary variables $\beta_j$, is compatible with the abstract randomisation procedure
in a metric space. The actual estimates for the shifts above mainly relied on the
abstract dyadic structure, and would probably extend reasonably straightforwardly.